Identification of Palatal Fricative Fronting Using Shannon Entropy of Spectrogram

2019 
In this paper, an attempt has been made to identify palatal fricative fronting in children's speech, where the postalveolar /sh/ is mispronounced as the dental /s/. In children's speech, the concentration of energy (the darkest part) of the spectrogram ranges from 4000 Hz to 8000 Hz for /s/, whereas it ranges from 3000 Hz to 8000 Hz for /sh/. The Gammatonegram follows the frequency subbands of the ear (wider at higher frequencies). Various spectral properties extracted from the Gammatonegram are proposed for the characterization of /sh/ and /s/: spectral centroid, spectral crest factor, spectral decrease, spectral flatness, spectral flux, spectral kurtosis, spectral spread, spectral skewness, spectral slope, and Shannon entropy of the spectrogram (in intervals of 2000 Hz). The dataset used for the analysis was recorded from 60 native Kannada-speaking children aged 3 1/2 to 6 1/2 years and is taken from the NITK Kids' Speech Corpus. Support vector machines (SVMs) are used for classification. Various combinations of the proposed features are evaluated, along with MFCCs (39) and LPCCs (39). The combination of MFCCs (39), LPCCs (39), and Entropy (4) is observed to achieve the highest mispronunciation identification performance, 83.2983%.
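To make the entropy feature concrete, the following is a minimal sketch of Shannon entropy computed per 2000 Hz band of a spectrogram. It is not the authors' implementation. It assumes a plain STFT power spectrogram in place of the Gammatonegram, a 16 kHz sampling rate, four bands covering 0 Hz to 8000 Hz (one reading of "Entropy (4)"), and entropy taken over all time-frequency cells in a band. The function names `power_spectrogram` and `band_entropies` and all parameter defaults are hypothetical.

```python
import numpy as np


def power_spectrogram(signal, sample_rate, frame_len=512, hop=256):
    """Return (freqs, power); power has shape (n_bins, n_frames).

    Assumption: a Hann-windowed STFT stands in for the Gammatonegram.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop: i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    power = np.abs(np.fft.rfft(frames, axis=0)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    return freqs, power


def band_entropies(freqs, power, band_width=2000.0, f_max=8000.0):
    """Shannon entropy (bits) of the energy distribution in each band.

    Assumption: bands are [0, 2000), [2000, 4000), ... up to f_max, and
    the energy of every time-frequency cell in a band is normalized to
    a probability distribution before the entropy is taken.
    """
    edges = np.arange(0.0, f_max + band_width, band_width)
    entropies = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = power[(freqs >= lo) & (freqs < hi)].ravel()
        total = band.sum()
        if total <= 0:
            entropies.append(0.0)  # silent band: define entropy as zero
            continue
        p = band / total
        p = p[p > 0]  # drop zero cells, since 0 * log(0) is taken as 0
        entropies.append(float(-(p * np.log2(p)).sum()))
    return np.array(entropies)
```

Calling `band_entropies(*power_spectrogram(x, 16000))` on a fricative segment `x` yields a four-element vector. Under these assumptions, a band in which energy is spread evenly gives high entropy, and a band in which energy is concentrated in a few cells gives low entropy, so the vector reflects where the spectral energy is concentrated.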