MARY TTS unit selection and HMM-based voices for the Blizzard Challenge 2013

2013 
This paper describes the implementation of a unit selection English voice and an HMM-based Hindi voice for our participation in the Blizzard Challenge 2013. Both voices were created using the MARY TTS voice building framework. We describe how audiobook data is used to create the English voice and how a quality control measure (statistical model cost) is used, in addition to target and join costs, to control the selection of unit candidates. The implementation of the Hindi voice and the new Hindi language components in the MARY TTS framework are also described. We obtained close to average results for both systems, especially in the emotion category for the English voice, naturalness for the Hindi voice, and Word Error Rate (WER) for both systems.
Index Terms: speech synthesis, unit selection, join cost, multilingual, open source