Application of Noise Reduction Techniques to Improve Speaker Verification to Multi-Speaker Text-to-Speech Input

2022 
Text-to-speech (TTS) is now widely used, in applications ranging from hearing aids to virtual assistants. However, developing voice models for TTS requires a large amount of sample speech from professional speakers. Voice cloning can reduce this development cost by generating artificial voices from small speech samples. Speaker verification to multi-speaker text-to-speech (SV2TTS) makes this possible with three separate neural networks and a large amount of speech data. It is still impractical for casual use, however, because of ambient noise. Noise in the recorded speech introduces garbage data during training, which degrades the output. We propose adding a noise reduction system to the SV2TTS recorder to remove noise from the speech data and improve the output of SV2TTS. We compared six noise reduction algorithms and applied the best-performing one to SV2TTS. We intend to extend this research to implement SV2TTS for the Bengali language.