I-Vector Transformation Using K-Nearest Neighbors for Speaker Verification

2020 
Probabilistic Linear Discriminant Analysis (PLDA) is the most effective backend for i-vectors. However, it requires labeled background data, which can be difficult to obtain in practice. Unlike PLDA, cosine scoring needs no speaker labels, at the cost of degraded performance. In this work, we propose a post-processing of i-vectors using a Deep Neural Network (DNN) that transforms i-vectors into a new speaker vector representation. The DNN is trained using i-vectors that are similar to the training i-vectors, with these similar i-vectors selected in an unsupervised manner. Using the new vector representation, the experimental trials are scored with cosine scoring. The evaluation is performed on the speaker verification trials of the VoxCeleb-1 database. The experiments show that, with the help of the similar i-vectors, the new vectors become more discriminative than the original i-vectors. The new vectors achieve a relative improvement of 53% in terms of EER compared to the conventional i-vector/PLDA system, without using speaker labels.
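To make the pipeline concrete, the sketch below illustrates how the three stages described in the abstract (unsupervised nearest-neighbor selection, DNN transformation, cosine scoring) could be wired together. It is a minimal sketch, not the authors' implementation: the array names, network size, number of neighbors, and the neighbor-mean training target are assumptions for illustration only.

```python
# Minimal sketch (not the authors' code) of the described pipeline, assuming
# i-vectors are rows of a float32 NumPy array. All hyperparameters are illustrative.
import numpy as np
import torch
import torch.nn as nn
from sklearn.neighbors import NearestNeighbors

def length_normalize(x):
    # Project vectors onto the unit sphere so cosine scoring reduces to a dot product.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Hypothetical unlabeled background set of i-vectors (dimension 400 is a common choice).
rng = np.random.default_rng(0)
background = length_normalize(rng.standard_normal((5000, 400)).astype(np.float32))

# Unsupervised selection: for every background i-vector, find its k nearest
# neighbors by cosine similarity; these act as the "similar" i-vectors.
k = 10
nn_index = NearestNeighbors(n_neighbors=k + 1, metric="cosine").fit(background)
_, idx = nn_index.kneighbors(background)
neighbor_mean = background[idx[:, 1:]].mean(axis=1).astype(np.float32)  # drop self-match

# Small DNN mapping an i-vector to a new representation; here it is trained to
# pull each i-vector toward the mean of its unsupervised neighbors (illustrative loss).
model = nn.Sequential(nn.Linear(400, 512), nn.Tanh(), nn.Linear(512, 400))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.from_numpy(background)
y = torch.from_numpy(neighbor_mean)
for epoch in range(20):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()

def cosine_score(enroll, test):
    # Transform both sides of a trial, length-normalize, and take the dot product.
    with torch.no_grad():
        e = model(torch.from_numpy(enroll[None, :])).numpy()
        t = model(torch.from_numpy(test[None, :])).numpy()
    e, t = length_normalize(e), length_normalize(t)
    return float(e @ t.T)

print(cosine_score(background[0], background[1]))
```

In this sketch the neighbor-mean regression target simply stands in for "training with similar i-vectors"; the actual training objective used in the paper may differ.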