Pitch Estimation by Multiple Octave Decoders

Yael Segal, May Arama-Chayoth, Joseph Keshet

2021

Pitch estimation is an essential task in audio processing because of its key role in many speech and music applications. Still, accurately predicting a continuous value over a wide range of pitch frequencies is challenging. Inspired by the success of signal processing filterbank methods, we propose a novel deep architecture for accurate pitch estimation. The proposed method is composed of an encoder and multiple decoders. The encoder is implemented as a convolutional neural network that produces a representation of the raw audio signal, and its output is fed into a set of decoders. Each decoder is implemented as a fully-connected neural network and predicts the pitch value within a specific frequency band. This construction allows each decoder to specialize in a particular frequency regime, which yields more accurate pitch estimates for music and speech signals.
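The data flow described in the abstract (one shared encoder feeding several band-specific decoders) can be illustrated with a minimal sketch. This is not the authors' implementation. The one-octave band layout, the lowest frequency of 32.7 Hz, the number of bands, the layer sizes, the sigmoid output mapping, and all function names are assumptions made for illustration. The weights are random and untrained, so the numeric estimates are meaningless. How the per-band estimates are combined into a single pitch value is not stated in the abstract and is left out.

```python
import math
import random


def octave_bands(f_min, n_octaves):
    """Split the range above f_min into consecutive one-octave bands (assumed layout)."""
    return [(f_min * 2 ** k, f_min * 2 ** (k + 1)) for k in range(n_octaves)]


def encode(signal, kernels, stride):
    """Toy encoder: strided 1-D convolutions with ReLU, mean-pooled over time."""
    features = []
    for kernel in kernels:
        width = len(kernel)
        acts = []
        for start in range(0, len(signal) - width + 1, stride):
            window = signal[start:start + width]
            s = sum(w * x for w, x in zip(kernel, window))
            acts.append(max(0.0, s))  # ReLU
        features.append(sum(acts) / len(acts))
    return features


def decode_in_band(features, weights, bias, band):
    """Toy decoder: one fully-connected layer whose output is confined to its band."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    z = max(-30.0, min(30.0, z))      # clamp to keep exp() well-behaved
    u = 1.0 / (1.0 + math.exp(-z))    # sigmoid, value in (0, 1)
    low, high = band
    return low * (high / low) ** u    # log-uniform position inside the band


random.seed(0)
sample_rate = 16000                   # assumed sampling rate
frame = [math.sin(2 * math.pi * 220.0 * n / sample_rate) for n in range(1024)]

# Random, untrained parameters: 8 convolution kernels, one decoder per band.
kernels = [[random.uniform(-0.1, 0.1) for _ in range(64)] for _ in range(8)]
bands = octave_bands(f_min=32.7, n_octaves=6)
decoders = [([random.uniform(-1.0, 1.0) for _ in range(8)], 0.0) for _ in bands]

features = encode(frame, kernels, stride=16)   # shared representation
estimates = [
    decode_in_band(features, weights, bias, band)
    for (weights, bias), band in zip(decoders, bands)
]

for (low, high), est in zip(bands, estimates):
    print(f"band {low:7.1f}-{high:7.1f} Hz -> estimate {est:7.1f} Hz")
```

Because each decoder's output is mapped into its own band, a decoder only has to resolve pitch within one octave rather than across the full range, which is one plausible reading of the specialization the abstract describes.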
