Modelling speaker-size discrimination with voiced and unvoiced speech sounds based on the effect of spectral lift

2021 
Abstract We can estimate the size of a speaker solely from their speech sounds, regardless of whether the sounds are voiced or unvoiced. In this study, we developed a size perception model based on the computational theory of the stabilised wavelet transform (SWT) to explain a variety of size discrimination data. We also conducted extended experiments to evaluate the effect of spectral lift on speaker size discrimination from voiced and unvoiced speech sounds. The just noticeable difference (JND) and the point of subjective equality (PSE) for speaker size discrimination were compared between speech sounds with natural and lifted spectra. On average, listeners tended to judge that the lifted speech came from a smaller speaker. The PSE, which indicates the systematic difference in perceived size, shifted by approximately 10% for unvoiced speech sounds (Exp. 1) and by approximately 5% for voiced speech sounds (Exp. 2). The JND depended on the spectral lift for unvoiced sounds, but not for voiced sounds. At the same time, there were large differences between listeners: some listeners’ judgments were affected by the spectral lift, while others’ were not. We constructed a size discrimination model to explain all of the experimental results, including this listener dependence, for voiced and unvoiced speech sounds. We introduced a weighting function, based on the Size-Shape Image (SSI) in the SWT, which reduces the influence of resolved harmonics caused by the glottal pulse sequence in voiced speech. As a result, the model with the SSI weighting function predicted each individual listener’s data reasonably well, whether or not the judgments were affected by the spectral lift, and whether the speech sounds were voiced or unvoiced. Optimising a single parameter, the spectral compensation coefficient, enabled us to explain the data of all individuals.
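As an illustration of how a PSE and a JND can be read off a psychometric function, the following is a minimal sketch and not the authors' analysis. It assumes a cumulative-Gaussian psychometric function, takes the PSE as its 50% point, and takes the JND as the distance between the 75% and 50% points; the abstract does not state which definitions were used. The stimulus levels and response counts are made-up numbers generated from a known curve (mean 5, standard deviation 8), and all function names are hypothetical.

```python
import math
from statistics import NormalDist

def psychometric(x, mu, sigma):
    """Cumulative Gaussian: probability of a 'smaller speaker' response at level x."""
    return NormalDist(mu, sigma).cdf(x)

def fit_grid(levels, n_smaller, n_trials, mus, sigmas):
    """Brute-force maximum-likelihood fit of (mu, sigma) over a grid."""
    best = None
    for mu in mus:
        for sigma in sigmas:
            ll = 0.0
            for x, k, n in zip(levels, n_smaller, n_trials):
                # Clamp probabilities so that log() stays finite.
                p = min(max(psychometric(x, mu, sigma), 1e-9), 1 - 1e-9)
                ll += k * math.log(p) + (n - k) * math.log(1 - p)
            if best is None or ll > best[0]:
                best = (ll, mu, sigma)
    return best[1], best[2]

def pse_jnd(mu, sigma):
    """PSE = 50% point; JND = 75% point minus 50% point (an assumed definition)."""
    return mu, sigma * NormalDist().inv_cdf(0.75)

# Hypothetical data: size-scale difference in percent, 100 trials per level.
# Counts are rounded expected values from a curve with mu = 5, sigma = 8.
levels = [-16, -8, 0, 8, 16]
n_trials = [100] * len(levels)
n_smaller = [round(100 * psychometric(x, 5.0, 8.0)) for x in levels]

mus = [i * 0.5 for i in range(-20, 41)]       # -10 ... 20
sigmas = [2.0 + 0.5 * i for i in range(40)]   # 2 ... 21.5
mu_hat, sigma_hat = fit_grid(levels, n_smaller, n_trials, mus, sigmas)
pse, jnd = pse_jnd(mu_hat, sigma_hat)
print(f"PSE = {pse:.1f}%, JND = {jnd:.1f}%")
```

In this sketch a nonzero PSE corresponds to a systematic bias in perceived size, and the JND reflects the slope of the curve, so the two quantities can change independently of each other.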