Concept-based Topic Attention for a Convolutional Sequence Document Summarization Model

2021 
Neural network-based document summarization often suffers from summarizing content that is irrelevant to a document's main idea. A key cause is the lack of human common-sense knowledge, which leads models to generate facts that are not decipherable. We propose a document summarization framework called Document Summarization with Concept-based Topic Triple Attention (DOSCTTA), which incorporates concept-based topic information into a convolutional sequence document summarization model. We propose a concept-based topic model (CTM) that generates semantic topic information using conceptual knowledge retrieved from a knowledge base, and we introduce a triple attention mechanism (TAM) that measures not only the importance of each topic concept and each source element to the output elements but also the importance of each topic concept to each source element. TAM captures contextual information from these three aspects and combines them with a softmax activation to obtain the final probability distribution, enabling the model to produce coherent, meaningful summaries with a rich vocabulary. Experimental evaluations on the Gigaword and CNN/Daily Mail (CNN/DM) datasets show that DOSCTTA surpasses widely recognized state-of-the-art models such as Seq2Seq, PGEN, CSM, and TopicCSM, achieving competitive results and generating coherent, informative summaries.
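The abstract describes TAM as combining three attention views (topic concept to output, source element to output, and topic concept to source element) with a softmax over the result. The following is a minimal illustrative sketch of that idea, not the paper's actual implementation: it assumes simple dot-product attention, and all function and variable names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def triple_attention(decoder_state, source_states, topic_states):
    """Sketch of a triple attention combination (hypothetical, not DOSCTTA's code).

    decoder_state: (d,)        representation of the current output element
    source_states: (n_src, d)  encoded source elements
    topic_states:  (n_top, d)  encoded topic concepts from the topic model
    """
    # View 1: importance of each source element to the output element.
    src_scores = source_states @ decoder_state            # (n_src,)
    # View 2: importance of each topic concept to the output element.
    top_scores = topic_states @ decoder_state             # (n_top,)
    # View 3: importance of each topic concept to each source element.
    cross = topic_states @ source_states.T                # (n_top, n_src)
    # Fold the topic-source interaction back onto source positions,
    # weighting each concept by its relevance to the output.
    top_weights = softmax(top_scores)                     # (n_top,)
    cross_scores = top_weights @ cross                    # (n_src,)
    # Combine the views and normalize with a softmax to get the
    # final attention distribution over source positions.
    attn = softmax(src_scores + cross_scores)             # (n_src,)
    context = attn @ source_states                        # (d,)
    return attn, context
```

In this sketch the topic-aware scores simply shift the plain source attention, so source elements that align with output-relevant concepts receive more probability mass; the paper's actual parameterization may differ.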