Unsupervised abstractive summarization via sentence rewriting

2023 
Unsupervised extractive summarization aims to extract salient sentences from a document without a labeled corpus. Existing methods have made promising progress thanks to large-scale pre-trained language models and high-quality contextualized representations. However, because extractive summaries are spliced together from separate sentences, they often lack smooth transitions and struggle to form a coherent, fluent text. Yet, to the best of our knowledge, very few studies currently focus on unsupervised abstractive summarization. Inspired by the intuitive human process of writing summaries, which involves first extracting salient sentences and then reconstructing them, we propose an Extract-then-Abstract framework to generate more coherent and human-like summaries. Specifically, in the extraction stage we adopt an extractive summarization model to produce an extractive summary. Then, in the abstraction stage, we propose a BART-based sentence rewrite model that turns this extractive summary into a more coherent and fluent abstractive summary. To train the rewrite model, we design a novel parallel data creation method built on an effective sentence sampling strategy that requires no manual annotation. Extensive experiments, including automatic and human evaluation, demonstrate that our framework consistently outperforms strong baselines for unsupervised abstractive summarization and generates more coherent and human-like summaries while maintaining ROUGE scores competitive with unsupervised extractive summarization.
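The two-stage control flow described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. The extractor here is a simple bag-of-words centrality scorer standing in for whatever unsupervised extractive model is used, and the rewrite stage is a pluggable callable where a BART-based rewrite model would go. All function names (`split_sentences`, `extract`, `summarize`) and the parameter `k` are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def _bow(sentence: str) -> Counter:
    """Lowercased bag-of-words counts for one sentence."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def extract(document: str, k: int = 2) -> List[str]:
    """Stage 1 (stand-in): keep the k most central sentences, in document order.

    Centrality is the summed cosine similarity to all other sentences.
    This is an assumed placeholder, not the extractor used in the paper.
    """
    sentences = split_sentences(document)
    vectors = [_bow(s) for s in sentences]
    scores = [
        sum(_cosine(v, w) for j, w in enumerate(vectors) if j != i)
        for i, v in enumerate(vectors)
    ]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def summarize(
    document: str, rewrite: Callable[[List[str]], str], k: int = 2
) -> str:
    """Stage 2: hand the extracted sentences to a rewrite function.

    `rewrite` is where a trained sentence rewrite model would be plugged in.
    """
    return rewrite(extract(document, k))
```

Passing `rewrite=" ".join` reproduces a plain extractive summary. Swapping in a trained sequence-to-sequence rewriter changes only that one argument, which is the separation of concerns the framework relies on.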