AI Syncs Song Lyrics

Table of Contents

  1. Introduction
  2. The Inspiration Behind the Project
  3. Challenges Faced in Syncing Song Lyrics
    1. Poorly Timed Lyrics on Spotify
    2. Initial Attempts to Isolate Vocals
    3. Discovering the Spleeter AI Model
    4. Using the Whisper Speech-to-Text Library
  4. The Whisper Library and its Accuracy
  5. Syncing Lyrics with Text Similarity
    1. Using the Jaccard Similarity Function
    2. Matching Lyrics with Whisper Segments
  6. The Process of Syncing Lyrics
    1. Comparing Similarity Between Segments
    2. Removing Synced Lyrics from the Unsynced Pile
    3. Repeating the Process for All Segments
  7. The Result: An LRC File
    1. Previewing the Result with Mini Lyrics
    2. Possible Applications and Limitations
  8. Trying the Syncing Process Yourself
    1. Using the GitHub Repo to Try the Project
    2. Providing Feedback and Sharing Results
  9. Conclusion

Automatically Syncing Song Lyrics Using AI: A Step-by-Step Guide

Hey everyone, it's Ryan! Welcome back to The Syntax Bite. In this video, I'm going to walk you through how I used AI to automatically sync song lyrics. The project came to life over the holidays, when I noticed that the lyrics on Spotify weren't always well synced with the music. This intrigued me, so I decided to explore the idea further.

The Inspiration Behind the Project

During the holiday season, I found myself listening to a lot of music on Spotify. I soon realized that the lyrics Spotify provided were often poorly timed, causing a slight disconnect between the music and the words. I initially assumed that the timing was AI-generated and so wouldn't be perfect. Either way, there was clearly room for improvement.

To test my idea, I chose the song "Evacuate the Dance Floor" by Cascada. The unsynced lyrics highlighted the need for a solution. This song became the foundation of my testing and motivated me to explore the possibility of using AI to sync lyrics.

Challenges Faced in Syncing Song Lyrics

Syncing song lyrics using AI posed several challenges. The first hurdle was finding a way to isolate the vocals in a song. I believed that isolating the vocals would make it easier for speech-to-text models to understand the lyrics accurately. My initial approach was to explore vocal-isolation methods in audio tools like Audacity, and I hoped to find a Python library that could replicate the same strategy.

Fortunately, I stumbled upon Spleeter by Deezer, a Python library with pretrained AI models designed to automatically separate vocals from music. This discovery eliminated the need for me to create a custom solution and allowed me to focus on the larger goal of syncing lyrics.
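As a rough illustration of this step, here is a minimal sketch of calling Spleeter from Python. It assumes Spleeter is installed and uses its two-stem (vocals plus accompaniment) model with the default output layout. The function names `vocals_path` and `isolate_vocals` and the `separated` folder are made up for this example and are not taken from the project's repo.

```python
from pathlib import Path


def vocals_path(audio_file, output_dir):
    """Path where Spleeter's default layout writes the vocal stem:
    <output_dir>/<song name without extension>/vocals.wav"""
    return Path(output_dir) / Path(audio_file).stem / "vocals.wav"


def isolate_vocals(audio_file, output_dir="separated"):
    """Split a song into vocals and accompaniment, return the vocals file path."""
    # Imported lazily so vocals_path() can be used without Spleeter installed.
    from spleeter.separator import Separator

    separator = Separator("spleeter:2stems")  # 2 stems: vocals + accompaniment
    separator.separate_to_file(str(audio_file), str(output_dir))
    return vocals_path(audio_file, output_dir)
```

Calling `isolate_vocals("song.mp3")` would download the pretrained model on first use and should leave the isolated vocals at `separated/song/vocals.wav`.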

The next challenge was finding a suitable speech-to-text library. I discovered Whisper, a library developed by OpenAI, the same organization behind the widely known ChatGPT. Whisper provided the functionality I needed to convert audio into text. The library offers multiple models, ranging from Tiny to Large, each with different accuracy and RAM requirements.

The Whisper Library and its Accuracy

Starting with the base model, one step up from Tiny, I encountered some amusing and far-from-perfect results. At this stage, the lyrics were barely recognizable and required further refinement. Moving up to the Medium model, however, yielded significantly better results. While not perfect, the lyrics were now recognizable and could be used to match the user-provided lyrics. It was clear that this approach was feasible, at least on a basic level.
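For reference, a minimal sketch of transcribing a vocals file with a chosen Whisper model size might look like the following. It assumes the `openai-whisper` package is installed. The helper names `segment_starts` and `transcribe_segments` are hypothetical, chosen for this example only.

```python
def segment_starts(result):
    """Reduce a Whisper transcription result to (start_seconds, text) pairs."""
    return [(float(seg["start"]), seg["text"].strip()) for seg in result["segments"]]


def transcribe_segments(audio_file, model_size="medium"):
    """Transcribe an audio file and return its timestamped segments."""
    # Imported lazily so segment_starts() can be used without Whisper installed.
    import whisper

    model = whisper.load_model(model_size)  # e.g. "tiny", "base", "medium", "large"
    result = model.transcribe(str(audio_file))
    return segment_starts(result)
```

Switching from `"base"` to `"medium"` only changes the `model_size` argument, which makes it easy to compare transcription quality between models.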

Apart from accurate transcription, Whisper had another valuable feature for this application: it automatically splits the audio into timestamped segments. Leveraging this segmentation, I could use the timestamps provided by Whisper to assign times to the user-provided lyrics. To achieve this, I opted for a simple Jaccard similarity function that I discovered in the Python Data Analysis book. This function, written in vanilla Python, was efficient and provided satisfactory results.
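Jaccard similarity is the size of the intersection of two sets divided by the size of their union. Below is a minimal vanilla-Python sketch that applies it to the sets of words in two pieces of text. The tokenization (lowercasing and keeping only letters, digits, and apostrophes) is an assumption made for this example, not necessarily what the original function does.

```python
import re


def words(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of the word sets of two strings, from 0.0 to 1.0."""
    wa, wb = words(a), words(b)
    union = wa | wb
    # Two empty strings share nothing to compare, so treat them as dissimilar.
    return len(wa & wb) / len(union) if union else 0.0
```

Because the comparison is set-based, word order and repeated words are ignored, which makes it tolerant of small transcription errors.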

Syncing Lyrics with Text Similarity

To sync the lyrics, the code compared the similarity between the speech-to-text generated segments and the unsynced lyrics. The process involved gradually adding lines from the unsynced lyrics to create different samples. Each sample represented an increasing number of lines from the lyrics. Comparing these samples to the speech-to-text generated segments helped identify the most similar sample.

The sample with the highest similarity was assigned the start time of the segment, and those lyrics were removed from the unsynced pile. This process continued until all segments were matched and synced with the lyrics. Any remaining lyrics were tagged with the final timestamp. Ultimately, the code produced an LRC file—a simple text-based file format for lyrics syncing.
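The matching loop described above can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: segments are taken to be `(start_seconds, text)` pairs, the `max_lines` cap on sample size is invented for this example, and "the final timestamp" is interpreted as the start time of the last segment.

```python
import re


def jaccard(a, b):
    """Jaccard similarity between the word sets of two strings."""
    wa = set(re.findall(r"[a-z0-9']+", a.lower()))
    wb = set(re.findall(r"[a-z0-9']+", b.lower()))
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def sync_lyrics(segments, lyric_lines, max_lines=6):
    """Assign each segment's start time to the best-matching run of lyric lines.

    segments: list of (start_seconds, transcribed_text), in time order.
    Returns a list of (start_seconds, [lyric lines]).
    """
    remaining = list(lyric_lines)  # the "unsynced pile"
    synced = []
    last_start = 0.0
    for start, text in segments:
        last_start = start
        if not remaining:
            break
        best_count, best_score = 1, -1.0
        # Build samples from the first 1, 2, 3, ... unsynced lines.
        for count in range(1, min(max_lines, len(remaining)) + 1):
            score = jaccard(" ".join(remaining[:count]), text)
            if score > best_score:
                best_count, best_score = count, score
        synced.append((start, remaining[:best_count]))
        remaining = remaining[best_count:]  # remove synced lines from the pile
    if remaining:
        # Leftover lyrics are tagged with the final timestamp.
        synced.append((last_start, remaining))
    return synced
```

Always taking lines from the front of the pile keeps the lyrics in their original order, at the cost of letting one bad match shift every later line.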

The Process of Syncing Lyrics

The syncing process involved several iterative steps. The code compared the similarity between each speech-to-text generated segment and the unsynced lyrics, gradually building up the synced lyrics sample by sample. This process continued until all segments were successfully matched with lyrics.

The power of the Jaccard similarity function, coupled with Whisper's segmentation, allowed for a relatively efficient sync. However, it's important to note that this process did not provide exact line-by-line syncing, as found on Spotify, because every lyric line matched to a segment shares that segment's start time. In theory, a more precise sync could be approximated by dividing the time between consecutive start timestamps evenly among the lines in each group. Nevertheless, the current approach provides satisfactory results for most songs.
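The even-division idea was not implemented in the project as described, but a hypothetical sketch could look like this. It assumes the synced output is a list of `(start_seconds, [lines])` groups in time order, and that the song's end time (`song_end`) is known so the last group has an upper bound.

```python
def spread_evenly(groups, song_end):
    """Give each line its own timestamp by splitting each group's time span evenly.

    groups: list of (start_seconds, [lyric lines]) in time order.
    song_end: end time in seconds, used as the upper bound for the last group.
    Returns a flat list of (seconds, line).
    """
    timed = []
    for i, (start, lines) in enumerate(groups):
        # A group lasts until the next group starts (or until the song ends).
        end = groups[i + 1][0] if i + 1 < len(groups) else song_end
        step = max(end - start, 0.0) / len(lines) if lines else 0.0
        for j, line in enumerate(lines):
            timed.append((start + j * step, line))
    return timed
```

This treats every line in a group as equally long, so it is only an approximation. Lines with long held notes or instrumental gaps would still drift.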

The Result: An LRC File

The result of the syncing process is an LRC file. This file format, commonly used for lyrics syncing, allows for easy previewing of the synced lyrics. I personally used Mini Lyrics, a plugin for Windows Media Player, to visualize the lyrics syncing. However, there are other programs available that can read the same file format.
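In its simplest form, an LRC file is one lyric line per text line, each prefixed with a `[mm:ss.xx]` timestamp. Here is a minimal sketch of writing that format. The function names are illustrative, and it assumes the synced lyrics are already available as `(seconds, line)` pairs.

```python
def lrc_timestamp(seconds):
    """Format seconds as an LRC time tag, e.g. 75.5 -> [01:15.50]."""
    # Work in whole hundredths of a second to avoid rounding up to ":60.00".
    total_cs = round(seconds * 100)
    minutes, rest = divmod(total_cs, 6000)
    secs, cs = divmod(rest, 100)
    return f"[{minutes:02d}:{secs:02d}.{cs:02d}]"


def to_lrc(timed_lines):
    """Render (seconds, line) pairs as the text of an LRC file."""
    return "\n".join(f"{lrc_timestamp(t)}{line}" for t, line in timed_lines)
```

Writing the returned text to a file with the same base name as the audio file (for example `song.lrc` next to `song.mp3`) is the usual convention for players that pick up lyrics automatically.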

Overall, I was impressed with the results. While there were minor difficulties with repetitive lyrics towards the end of songs, the synced lyrics were perfectly usable. This approach proved to be on par with the lyrics syncing currently available on Spotify, if not better.

It's worth noting that while Spotify and other music platforms could potentially implement a similar system, copyright restrictions pose a challenge. However, for users interested in trying the syncing process themselves, I've provided a link to the GitHub repo in the description below. With just the song lyrics and an MP3 file, you can try out the project and see how it works with your own music.

Trying the Syncing Process Yourself

If you're curious about syncing song lyrics using AI, I encourage you to try the project yourself. The GitHub repository contains all the necessary code and instructions. I would love to hear your feedback and find out which songs you tried and how well the sync worked. Your input can help improve and refine the syncing process further.

Conclusion

If I was able to achieve this level of lyrics syncing in just a few hours, why can't Spotify and other music platforms implement a similar solution? It seems like a practical approach that could enhance the overall music listening experience. Thank you for watching, and if you found this video interesting, don't forget to like and subscribe to the channel. Stay tuned for more exciting projects in the future!
