Predicting the effects of periodicity on the intelligibility of masked speech: An evaluation of different modelling approaches and their limitations

2019 
Four existing speech intelligibility models with different theoretical assumptions were used to predict previously published behavioural data. Those data showed that complex tones with pitch-related periodicity are far less effective maskers of speech than aperiodic noise. This so-called masker-periodicity benefit (MPB) far exceeded the fluctuating-masker benefit (FMB) obtained from slow masker envelope fluctuations. In contrast, the normal-hearing listeners hardly benefitted from periodicity in the target speech. All tested models consistently underestimated MPB and FMB, while most of them also overestimated the intelligibility of vocoded speech. To understand these shortcomings, the internal signal representations of the models were analysed in detail. The best-performing model, the correlation-based version of the speech-based envelope power spectrum model (sEPSMcorr), combined an auditory processing front end with a modulation filterbank and a correlation-based back end. This model was then modified to further improve the predictions. The resulting second version of the sEPSMcorr outperformed the original model with all tested maskers and accounted for about half the MPB, which can be attributed to reduced modulation masking caused by the periodic maskers. However, as the sEPSMcorr2 failed to account for the other half of the MPB, the results also indicate that future models should consider the contribution of pitch-related effects, such as enhanced stream segregation, to further improve their predictive power.Four existing speech intelligibility models with different theoretical assumptions were used to predict previously published behavioural data. Those data showed that complex tones with pitch-related periodicity are far less effective maskers of speech than aperiodic noise. This so-called masker-periodicity benefit (MPB) far exceeded the fluctuating-masker benefit (FMB) obtained from slow masker envelope fluctuations. In contrast, the normal-hearing listeners hardly benefitted from periodicity in the target speech. All tested models consistently underestimated MPB and FMB, while most of them also overestimated the intelligibility of vocoded speech. To understand these shortcomings, the internal signal representations of the models were analysed in detail. The best-performing model, the correlation-based version of the speech-based envelope power spectrum model (sEPSMcorr), combined an auditory processing front end with a modulation filterbank and a correlation-based back end. This model was then modified t...
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    0
    References
    9
    Citations
    NaN
    KQI
    []