Speaker height estimation combining GMM and linear regression subsystems

Keri Williams, John H. L. Hansen

2013

There are both scientific and technology-based motivations for establishing effective speech processing algorithms that estimate speaker traits. Estimating speaker height can assist in voice forensic analysis [1] and can provide side knowledge to improve speaker ID systems or to guide acoustic model selection for improved speech recognition. In this study, two distinct approaches to height estimation are explored. The first is statistically based and incorporates acoustic models within a GMM structure, while the second is a direct speech analysis approach that employs linear regression to obtain the height directly. The accuracy and trade-offs of these systems are explored, as well as a fusion of the two systems, using data from the TIMIT corpus (which includes ground truth on speaker height).
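To make the idea of two subsystems and their fusion concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single hypothetical scalar feature per speaker, uses one Gaussian per height class as a simplified stand-in for a GMM, and fuses the two estimates with a fixed weight. The function names, the class boundaries, the fusion weight, and the toy data are all illustrative assumptions; none of them come from the paper or from TIMIT.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_class_models(xs, ys, edges):
    """Fit one Gaussian over the feature for each height class.

    Assumption: a single 1-D Gaussian per class stands in for a GMM.
    Classes are the half-open height intervals [edges[i], edges[i + 1]).
    Returns a list of (feature_mean, feature_var, mean_height, prior).
    """
    models = []
    for lo, hi in zip(edges, edges[1:]):
        members = [(x, y) for x, y in zip(xs, ys) if lo <= y < hi]
        if len(members) < 2:
            continue  # too few speakers to estimate a variance
        feats = [x for x, _ in members]
        mu = sum(feats) / len(feats)
        var = sum((x - mu) ** 2 for x in feats) / len(feats) + 1e-6
        mean_h = sum(y for _, y in members) / len(members)
        models.append((mu, var, mean_h, len(members) / len(xs)))
    return models


def class_model_height(x, models):
    """Height estimate as the posterior-weighted mean of class heights."""
    weights = []
    for mu, var, _, prior in models:
        lik = math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
        weights.append(prior * lik)
    total = sum(weights)
    if total == 0.0:
        # Feature far from every class: fall back to the priors alone.
        weights = [m[3] for m in models]
        total = sum(weights)
    return sum(w * m[2] for w, m in zip(weights, models)) / total


def fuse(height_a, height_b, weight=0.5):
    """Linear fusion of two height estimates (weight is an assumed value)."""
    return weight * height_a + (1.0 - weight) * height_b


# Toy data, invented for illustration: one feature value and height (cm) per speaker.
features = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7]
heights = [185.0, 182.0, 178.0, 176.0, 171.0, 168.0, 163.0, 160.0]

slope, intercept = fit_line(features, heights)
models = fit_class_models(features, heights, edges=[155.0, 165.0, 175.0, 190.0])

test_feature = 1.35
regression_estimate = slope * test_feature + intercept
class_estimate = class_model_height(test_feature, models)
print(fuse(class_estimate, regression_estimate, weight=0.5))
```

In a sketch like this, the fusion weight would normally be tuned on held-out speakers rather than fixed at 0.5, and a real system would use multidimensional acoustic features instead of a single scalar.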

Keywords:

Speech recognition
Regression analysis
Speaker diarisation
Speech processing
Speaker recognition
Acoustic model
Artificial intelligence
Computer science
Direct speech
Linear regression
Ground truth
Pattern recognition
TIMIT

