Introducing the RODeCAR Database for Deceptive Speech Detection.

2019 
This work introduces the Romanian Deva Criminal Investigation Audio Recordings (RODeCAR) database, a dataset of truthful and untruthful speech constructed by analyzing, processing, and cross-examining archived original criminal investigation recordings. The main advantage of this database is the casework nature of its content: all speakers were suspects or witnesses in real criminal investigations, and all interactions are spontaneous and part of actual law enforcement activity. Around 5 hours of recorded material were extracted from archives and processed. The database contains 19 speakers (4 female, 15 male), and 39% of the content is objectively annotated as untruthful, in a global sense. Using a standard framing scheme (25 ms frames with 15 ms overlap), the 1,225,464 available segments would form a large and reliable dataset.
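To make the framing scheme concrete, the following is a minimal sketch of the arithmetic behind it. It assumes that "25 ms frames with 15 ms overlap" means a 10 ms hop between consecutive frame starts and that only complete frames are counted. The function names are hypothetical and are not taken from the RODeCAR work.

```python
# Illustrative sketch only: the names and the "complete frames only" rule are
# assumptions, not part of the RODeCAR description.

def frame_count(duration_ms: int, frame_ms: int = 25, overlap_ms: int = 15) -> int:
    """Number of complete frames that fit in a signal of the given duration."""
    hop_ms = frame_ms - overlap_ms  # 25 ms - 15 ms = 10 ms between frame starts
    if hop_ms <= 0:
        raise ValueError("overlap must be shorter than the frame length")
    if duration_ms < frame_ms:
        return 0
    return 1 + (duration_ms - frame_ms) // hop_ms


def covered_duration_ms(n_frames: int, frame_ms: int = 25, overlap_ms: int = 15) -> int:
    """Length of one contiguous signal spanned by n_frames overlapping frames."""
    if n_frames <= 0:
        return 0
    hop_ms = frame_ms - overlap_ms
    return frame_ms + (n_frames - 1) * hop_ms


if __name__ == "__main__":
    print(frame_count(1000))             # frames in one second of audio
    print(covered_duration_ms(1225464))  # span of the reported segment count, in ms
```

Under these assumptions, one second of audio yields 98 frames. How the reported segments are distributed across individual recordings is not stated in the abstract, so the second call only illustrates the arithmetic for a single contiguous signal.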