Singing-voice synthesis using demi-syllable unit selection

2016 
In this study, a unit selection algorithm with a dynamic-programming structure is proposed. The algorithm takes into account the costs of pitch and duration transformations as well as the costs of contextual and spectral discontinuities. The voice unit adopted here is the demi-syllable. In the training phase, each demi-syllable unit is analyzed to obtain a sequence of discrete cepstral coefficient (DCC) vectors. In the synthesis phase, the pitch and duration of a syllable can then be adjusted, and the singing voice signals are synthesized with a harmonic-plus-noise model (HNM). To evaluate the performance of our unit selection algorithm, we conducted two listening tests: one evaluates spectral fluency (continuity), and the other evaluates the quality of the synthesized songs. The results of both tests show that our algorithm noticeably improves the fluency and quality of a synthesized song.
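The dynamic-programming structure described above can be illustrated with a minimal sketch. It is a generic Viterbi-style search over candidate units and is not taken from the paper: the function name `select_units` and the two callbacks `target_cost` and `join_cost` are hypothetical. Here `target_cost` stands in for the pitch and duration transformation costs, and `join_cost` stands in for the contextual and spectral discontinuity costs between adjacent units; how those costs are actually defined and weighted is not stated in the abstract.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

U = TypeVar("U")  # a candidate unit, e.g. one recorded demi-syllable


def select_units(
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[int, U], float],
    join_cost: Callable[[U, U], float],
) -> Tuple[List[U], float]:
    """Pick one unit per position so that the summed cost is minimal.

    candidates[i] lists the units available for position i.
    target_cost(i, u) is an assumed per-unit cost (e.g. pitch/duration change).
    join_cost(p, u) is an assumed cost of concatenating p followed by u.
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every position needs at least one candidate unit")

    # best[i][j]: minimum total cost of any path ending at candidates[i][j]
    best = [[target_cost(0, u) for u in candidates[0]]]
    # back[i][j]: index of the predecessor chosen at position i - 1
    back = [[-1] * len(candidates[0])]

    for i in range(1, len(candidates)):
        row_cost: List[float] = []
        row_back: List[int] = []
        for u in candidates[i]:
            # Cost of reaching u from each candidate at the previous position
            costs = [
                best[i - 1][k] + join_cost(p, u)
                for k, p in enumerate(candidates[i - 1])
            ]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            row_cost.append(costs[k_min] + target_cost(i, u))
            row_back.append(k_min)
        best.append(row_cost)
        back.append(row_back)

    # Trace back from the cheapest final candidate
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path: List[U] = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = back[i][j]
    path.reverse()
    return path, total
```

With N positions and at most K candidates per position, the search evaluates on the order of N·K² joins instead of enumerating all Kᴺ sequences, which is the usual motivation for a dynamic-programming formulation of unit selection.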