Prosodic addressee-detection: ensuring privacy in always-on spoken dialog systems.

2020 
We analyze the addressee detection task for dialogs of identical complexity in both human conversation and device-directed speech. Our recurrent neural model performs at least as well as humans, who have problems with this task; this holds even for native speakers, who profit from the relevant linguistic skills. We perform ablation experiments on the features used by our model and show that fundamental frequency variation is the single most relevant feature class. We therefore conclude that future systems can detect whether they are being addressed based only on speech prosody, which does not (or only to a very limited extent) reveal the content of conversations not intended for the system.