Improving Unsupervised Extractive Summarization by Jointly Modeling Facet and Redundancy

2022 
Unsupervised extractive summarization aims to extract salient sentences from documents without a labeled corpus. Existing methods are mostly graph-based and rank sentences by computing sentence centrality. These methods have two main problems: facet bias and redundancy. Facet bias leads summarization models to select sentences within the same facet, which often causes other vital facets to be ignored, especially in long-document and multi-document settings. First, to address the facet bias problem, we propose a novel Facet-Aware centrality-based Ranking model (FAR). We let the model pay more attention to different facets by introducing a sentence-document weight, which is added to the sentence centrality score. FAR can also alleviate redundancy to some extent. Then, to further reduce redundancy, we propose a novel Redundancy- and Facet-Aware Ranking model (RFAR), which jointly models facet and redundancy by incorporating a Determinantal Point Process (DPP) into the previously proposed FAR. We evaluate FAR and RFAR on a wide range of summarization tasks covering 8 representative benchmark datasets. Experimental results show that FAR and RFAR consistently outperform strong baselines, especially in long- and multi-document scenarios, and even perform comparably to some supervised models. In addition, we find that our methods can alleviate the position bias problem.
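The two ideas in the abstract (a sentence-document weight added to a centrality score, and a DPP used to discourage redundant picks) can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract gives no equations, so the degree-style centrality, the mean-embedding document vector, the mixing weight `lam`, the quality-times-similarity DPP kernel, and the greedy selection loop are all assumptions chosen for illustration, as are the function names.

```python
import numpy as np


def cosine_sim(emb):
    """Pairwise cosine similarity between row vectors."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return unit @ unit.T


def facet_aware_scores(emb, lam=0.5):
    """Centrality plus a sentence-document weight (assumed form)."""
    sim = cosine_sim(emb)
    np.fill_diagonal(sim, 0.0)            # ignore self-similarity
    centrality = sim.sum(axis=1)          # degree-style centrality (assumption)
    doc = emb.mean(axis=0)                # document vector as mean embedding (assumption)
    doc_sim = (emb @ doc) / (np.linalg.norm(emb, axis=1) * np.linalg.norm(doc))
    return centrality + lam * doc_sim     # the weight is added to the centrality score


def greedy_dpp_select(emb, quality, k):
    """Greedily pick k sentences that maximize det of a DPP kernel submatrix.

    Kernel L[i, j] = quality[i] * sim[i, j] * quality[j]; quality is assumed
    to be non-negative. Near-duplicate sentences shrink the determinant,
    so they are unlikely to be picked together.
    """
    sim = cosine_sim(emb)
    kernel = quality[:, None] * sim * quality[None, :]
    selected = []
    for _ in range(min(k, len(quality))):
        best, best_gain = None, -np.inf
        for i in range(len(quality)):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(kernel[np.ix_(idx, idx)])
            gain = logdet if sign > 0 else -np.inf
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:                  # no candidate keeps the determinant positive
            break
        selected.append(best)
    return selected


if __name__ == "__main__":
    # Toy embeddings: sentences 0 and 1 share a facet, sentence 2 covers another.
    toy = np.array([[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]])
    scores = facet_aware_scores(toy)
    print("scores:", scores)
    print("selected:", greedy_dpp_select(toy, scores, k=2))
```

In the toy input, ranking by score alone would take the two near-identical sentences, whereas the determinant-based selection prefers one of them plus the sentence from the other facet. Real sentence embeddings and a tuned `lam` would be needed for anything beyond illustration.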