Using long-term information to improve robustness in Speaker Identification

James Lyons,James O'Connell,Kuldip Kumar Paliwal

2010

In this paper we propose two new methods for improving the robustness of automatic speaker identification systems. Both rely on long-term information in the speech signal to make the features more robust. The first method averages filterbank parameters from consecutive short-time frames over a longer window. The second uses frame lengths longer than the interval over which speech is generally assumed to be stationary. We show that both methods improve on standard Mel Frequency Cepstral Coefficients for speaker identification in the presence of additive white Gaussian noise. Further improvements are observed at mid-range SNR when the two methods are used in combination.
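The first method and the noise condition can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the averaging window, whether averaging is applied to linear or log energies, or how the noise was generated, so the function names `add_awgn` and `smooth_filterbank`, the `span` parameter, and the plain moving average are all assumptions made for illustration.

```python
import numpy as np


def add_awgn(signal, snr_db, rng=None):
    """Add white Gaussian noise scaled to give the requested SNR in dB.

    Assumption: SNR is defined over the whole signal as the ratio of
    mean signal power to mean noise power.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def smooth_filterbank(fbank, span):
    """Average each filterbank channel over `span` consecutive frames.

    fbank: array of shape (num_frames, num_channels) holding per-frame
           filterbank parameters computed elsewhere.
    span:  number of consecutive short-time frames to average (hypothetical
           parameter; the abstract does not state a value).
    Returns an array of shape (num_frames - span + 1, num_channels).
    """
    fbank = np.asarray(fbank, dtype=float)
    if span < 1 or span > fbank.shape[0]:
        raise ValueError("span must be between 1 and the number of frames")
    kernel = np.ones(span) / span
    # Moving average along the time axis, one channel at a time.
    columns = [
        np.convolve(fbank[:, c], kernel, mode="valid")
        for c in range(fbank.shape[1])
    ]
    return np.stack(columns, axis=1)
```

In a pipeline of this kind, the smoothed filterbank output would take the place of the per-frame filterbank output before the cepstral transform. The second method would instead be a change to the frame length used when the signal is first split into frames.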

Keywords:

Signal processing
Additive white Gaussian noise
Speaker recognition
Filter bank
Speech recognition
Robustness (computer science)
Signal-to-noise ratio
Mel-frequency cepstrum
Artificial intelligence
Computer science
Pattern recognition
Acceleration
