Automatic Lyrics Transcription of Polyphonic Music With Lyrics-Chord Multi-Task Learning

2022 
Lyrics are the words that make up a song, while chords are harmonic sets of multiple notes. Both are essential components of polyphonic music, i.e. singing vocals mixed with instrumental accompaniment. Traditional lyrics transcription first extracts the singing vocals from the polyphonic music and then transcribes the extracted vocals, with the two steps optimized independently. In this paper, we propose novel end-to-end network architectures designed to disentangle lyrics from chords in polyphonic music so that lyrics can be transcribed effectively in a single step. Intuitively, we treat chords as musical words, analogous to the lexical words of lyrics. We start by studying a single-task lyrics transcriber, which serves both as the reference baseline and as the initial model from which the multi-task lyrics transcription solutions are developed. The main idea is to exploit the chord transcriptions available in the training data through multi-task training to improve lyrics transcription. Experiments show that the proposed multi-task lyrics transcriber significantly outperforms competing solutions, achieving a word error rate (WER) of 31.82% on a standard test dataset.
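For readers unfamiliar with the reported metric, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example lyric strings are illustrative assumptions, not taken from the paper, and the paper's own scoring pipeline (text normalization, tokenization) may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; whitespace tokenization is an
    assumption and no text normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution and one deletion against 4 reference words.
example_wer = word_error_rate("hold me close tonight", "hold be close")
```

Because insertions count as errors, a WER can exceed 100% when the hypothesis is much longer than the reference.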