USTC-NELSLIP System Description for DIHARD-III Challenge

Yuxuan Wang, Mao-Kui He, Shutong Niu, Lei Sun, Tian Gao, Xin Fang, Jia Pan, Jun Du, Chin-Hui Lee

2021

This paper describes our submission to the Third DIHARD Speech Diarization Challenge (DIHARD-III). In addition to a traditional clustering-based system, our approach combines several front-end techniques for diarization, including speech separation and target-speaker based voice activity detection (TS-VAD), together with iterative data purification. We also use audio domain classification to enable domain-dependent processing. Finally, a post-processing stage performs system fusion and selection. Our best system achieved diarization error rates (DERs) of 11.30% on track 1 and 16.78% on track 2 on the evaluation set.
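For readers unfamiliar with the metric, DER is conventionally defined as the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time. The sketch below only illustrates that arithmetic. The function name and the example durations are hypothetical, and this is not the challenge's official scoring tool, which also handles speaker mapping and overlapped speech.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_reference):
    """Return DER as a fraction, given error durations in seconds.

    All argument names are illustrative assumptions, not taken from the paper.
    """
    if total_reference <= 0:
        raise ValueError("total_reference must be positive")
    # DER = (missed speech + false alarm + speaker confusion) / reference speech
    return (missed + false_alarm + confusion) / total_reference


# Hypothetical durations, chosen only to show the calculation.
example_der = diarization_error_rate(
    missed=3.0, false_alarm=2.0, confusion=5.0, total_reference=100.0
)
print(f"DER = {example_der:.2%}")
```

Because false alarms are counted in the numerator but not in the denominator, DER can exceed 100% for a system that produces a large amount of spurious speech.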

Keywords:

Speech recognition
Cluster analysis
Set (abstract data type)
Computer science
Domain (software engineering)
Voice activity detection
Speaker diarisation
