Auto-Encoding Nearest Neighbor i-Vectors for Speaker Verification

2019 
In recent years, i-vectors followed by cosine or PLDA scoring techniques were the state-of-the-art approach in speaker verification. PLDA requires labeled background data, and there exists a significant performance gap between the two scoring techniques. In this work, we propose to reduce this gap by using an autoencoder to transform i-vectors into a new speaker vector representation, which will be referred to as the ae-vector. The autoencoder is trained to reconstruct neighbor i-vectors instead of the same training i-vectors, as is usual. These neighbor i-vectors are selected in an unsupervised manner according to the highest cosine scores to the training i-vectors. The evaluation is performed on the speaker verification trials of the VoxCeleb-1 database. The experiments show that our proposed ae-vectors gain a relative improvement of 42% in terms of EER compared to the conventional i-vectors using cosine scoring, which fills the performance gap between cosine and PLDA scoring techniques by 92%, but without using speaker labels.
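
The unsupervised neighbor selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `cosine` and `nearest_neighbor_pairs`, the parameter `k` (number of neighbors per i-vector), and the toy two-dimensional vectors are all assumptions made for illustration. The sketch only builds the (input, target) index pairs on which an autoencoder could then be trained, so that each i-vector is mapped toward a neighbor rather than toward itself.

```python
import math


def cosine(u, v):
    """Cosine score between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbor_pairs(ivectors, k=1):
    """Build (input_index, target_index) training pairs.

    For each i-vector, the targets are its k highest-scoring neighbors
    under cosine scoring, excluding the i-vector itself. No speaker
    labels are used. The value of k is an assumption of this sketch.
    """
    pairs = []
    for i, x in enumerate(ivectors):
        scored = [(cosine(x, y), j) for j, y in enumerate(ivectors) if j != i]
        scored.sort(reverse=True)  # highest cosine score first
        for _, j in scored[:k]:
            pairs.append((i, j))
    return pairs


# Toy 2-dimensional "i-vectors" (hypothetical data, for illustration only).
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
pairs = nearest_neighbor_pairs(toy, k=1)
print(pairs)
```

In this toy case the first two vectors select each other as neighbors, and so do the last two. An autoencoder trained on such pairs would take the i-vector at the first index as input and the i-vector at the second index as the reconstruction target.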