Towards Hierarchical Prosodic Prominence Generation in TTS Synthesis

Leonardo Badino, Robert A. J. Clark, Mirjam Wester

2012

We address the problem of identifying (from text) and generating pitch accents in HMM-based English TTS synthesis. We show, through a large-scale perceptual test, that a large improvement in the binary discrimination between pitch-accented and non-accented words has no effect on the quality of the speech generated by the system. On the other hand, adding a third accent type that emphatically marks words conveying "contrastive" focus (automatically identified from text) has beneficial effects on the synthesized speech. These results support accounts of prosodic prominence that consider the prosodic patterns of utterances to be hierarchically structured, and they point out the limits of flattening such structure through a simple accent/non-accent distinction.

Index Terms: speech synthesis, HMM, pitch accents, focus detection
