Detecting Aggression in Voice Using Inverse Filtered Speech Features

2018 
In social interactions, aggression is behavior directed at another individual with the intent of causing physical or psychological harm. In humans, anger or disgust caused by obstacles to achieving certain goals leads to aggression. This article proposes an automatic method for detecting aggression using features extracted from the pressure distribution in the vocal tract during voiced speech. The variations in air pressure across different sections of the vocal tract are computed from the speech signal by inverse estimation. A Hidden Markov Model is trained to classify these air pressure variations as Aggression or Calm. The system was tested on a set of audio clips extracted from television interviews of political personalities. A total of 120 audio clips with an overall duration of around 30 minutes were collected from three different speakers based on human perception of Aggression and Calm. Human raters then rated the clips so that perceptually ambiguous ones could be discarded. The speaker-dependent system was trained on 40 percent of each speaker's data and tested on the remaining 60 percent of the same speaker's data. The system detected aggression with about 93 percent accuracy.
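
The abstract does not spell out how the inverse estimation is carried out, so the following is only an illustrative sketch and not the authors' implementation. It assumes a common textbook route: linear prediction on a voiced frame (autocorrelation method with the Levinson-Durbin recursion), whose reflection coefficients are interpreted through a lossless-tube model of the vocal tract. The function names, the Hamming window, and the use of log area ratios as per-section features are all assumptions made for this example; the paper's pressure-based features may be derived differently.

```python
import math


def autocorrelation(frame, max_lag):
    """Autocorrelation of a Hamming-windowed frame for lags 0..max_lag."""
    n = len(frame)
    windowed = [
        x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]
    return [
        sum(windowed[i] * windowed[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def reflection_coefficients(r, order):
    """Levinson-Durbin recursion: autocorrelation lags -> reflection coefficients."""
    a = [1.0] + [0.0] * order   # prediction polynomial, a[0] is fixed at 1
    err = r[0]                  # prediction error energy
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
        ks.append(k)
    return ks


def log_area_ratios(ks):
    """Log ratio of adjacent tube-section areas.

    Assumed convention: ratio = (1 - k) / (1 + k). Textbooks differ on the
    sign of k, which flips the direction of the ratio.
    """
    return [math.log((1.0 - k) / (1.0 + k)) for k in ks]


def frame_features(frame, order=10):
    """Hypothetical per-frame feature vector for a sequence classifier."""
    r = autocorrelation(frame, order)
    return log_area_ratios(reflection_coefficients(r, order))
```

Applied frame by frame to voiced speech, a function like `frame_features` yields a sequence of vectors describing how the estimated tube shape changes over time. A sequence of this kind is the sort of input that one Hidden Markov Model per class (Aggression, Calm) could score, with the higher-likelihood model giving the label.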