Automatic speech recognition fusion approach to unsupervised speaker clustering and labeling

2006 
This paper describes a fully unsupervised approach to speaker clustering and labeling that employs automatic speech recognition (ASR) technology to bootstrap speaker identification (SID). An algorithm combining these two technologies correctly clustered and labeled 299 NATO ship-to-ship transmissions with an accuracy of 89% in an on-line (no a priori training) scenario. This fusion approach outperformed ASR alone by 23.6%, manually trained VQ-SID by 12.7%, and GMM/UBM-SID by 8.6%. This paper demonstrates that, under certain circumstances, unsupervised, self-organizing systems can be more effective than manually trained ones.