Rhetorical Structure Modeling for Lecture Speech Summarization
2009
We propose an extractive summarization system with a novel non-generative probabilistic framework for speech summarization. One of the most under-utilized features in extractive summarization is rhetorical information -semantically cohesive units that are hidden in spoken documents. We propose Rhetorical-State Hidden Markov Models (RSHMMs) to automatically decode this underlying structure in speech. We show that RSHMMs give a 68.67% ROUGE-L F-measure, a 6.44% absolute increase in lecture speech summarization performance compared to the baseline system without using RSHMM. We further propose an enhanced Rhetorical-State Hidden Markov Model (RSHMM++) for extracting hierarchical structural summaries from lecture speech. We show that RSHMM++ gives a 72.01% ROUGE-L F-measure, a 3.34% absolute increase in lecture speech summarization performance compared to the baseline system without using rhetorical information. We also propose Relaxed DTW for compiling reference summaries.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
14
References
0
Citations
NaN
KQI