Towards Blind Audio Quality Assessment using a Convolutional-Recurrent Neural Network

2021 
Blind estimation of audio quality is desirable in practical applications because the original reference signal is not always available. Subjective quality can be degraded by low-bitrate compression and by repeated compression during the content submission and distribution stages. Existing methods classify audio signals by encoding bitrate, but they must be told which audio codec was used and rely on estimates of the MDCT framing grid and window-type sequence from the encoding stage. In this work, a convolutional-recurrent neural network is proposed for blind AAC bitrate classification. Unlike the existing methods, the proposed method classifies the AAC bitrate directly from the MDCT coefficients, without any prior knowledge of the encoding framing grid or window-type sequence. The proposed method is further extended to multi-codec, bitrate-related perceptual audio quality classification, which has not been studied extensively in the existing literature. For the AAC bitrate classification task, the evaluation results show that the proposed method achieves accuracy similar to that of the existing methods for most bitrates, without using any framing grid information. For the multi-codec bitrate-related perceptual audio quality classification task, the proposed method achieves 95% accuracy in classifying double-compressed unseen data, encoded with three common audio codecs, into three perceptual classes.
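To make the input representation concrete, the following is a minimal sketch of computing MDCT coefficients over 50%-overlapping frames, using the standard MDCT definition with a sine window. It is not the paper's feature extraction pipeline: the function names, the frame size, and the `offset` parameter are hypothetical and chosen only for illustration. The `offset` argument stands in for the framing grid, which a blind method does not know and therefore cannot align to.

```python
import math


def sine_window(length):
    """Sine window of the given length (a common MDCT window choice)."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct_frame(frame):
    """Standard MDCT of one windowed frame: 2N samples in, N coefficients out."""
    two_n = len(frame)
    if two_n % 2:
        raise ValueError("frame length must be even")
    n_half = two_n // 2
    window = sine_window(two_n)
    shift = 0.5 + n_half / 2.0  # phase term of the MDCT definition
    return [
        sum(
            window[n] * frame[n]
            * math.cos(math.pi / n_half * (n + shift) * (k + 0.5))
            for n in range(two_n)
        )
        for k in range(n_half)
    ]


def mdct_frames(signal, n_half=8, offset=0):
    """Rows of MDCT coefficients for 50%-overlapping frames.

    `offset` (hypothetical) is the sample at which the analysis grid starts;
    a blind classifier has to work without knowing the encoder's value.
    """
    rows = []
    start = offset
    while start + 2 * n_half <= len(signal):
        rows.append(mdct_frame(signal[start:start + 2 * n_half]))
        start += n_half  # hop of N samples gives 50% overlap
    return rows
```

Stacking the rows returned by `mdct_frames` gives a time-by-frequency matrix of the kind a convolutional-recurrent network could take as input; the network architecture itself is not described in the abstract and is not sketched here.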