Bengali Abstractive News Summarization Using Seq2Seq Learning with Attention

Text summarization is the task of generating short, succinct summaries of long texts that focus on the most important information while preserving the overall meaning of the whole text. This paper presents a method for generating short abstractive summaries from long Bengali news articles using basic NLP techniques together with several Recurrent Neural Network (RNN) architectures, such as Bidirectional RNN, Encoder-Decoder RNN, Sequence to Sequence (Seq2Seq) learning with an attention mechanism, and Long Short-Term Memory (LSTM). The dataset used here was collected by the authors from different Bengali online newspapers. The dataset is first preprocessed, after which word embeddings and vocabulary counts are computed. Finally, the deep learning approaches are applied to generate an abstractive summary for each news article. The model uses Seq2Seq learning with an attention mechanism, which reduced the training loss to as low as 0.001. An existing dataset is also used for evaluation; the results on the authors' own dataset were more satisfactory than those on the existing dataset.
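
To illustrate the attention step mentioned above, the following is a minimal sketch of how a decoder state could be used to weight the encoder states and form a context vector. It is not the authors' implementation: the abstract does not state which scoring function is used, so the dot-product score, the function names `softmax` and `attend`, and the toy vectors are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return (weights, context) for a single decoder step.

    Assumption: scores are plain dot products between the decoder state
    and each encoder state; the paper may use a different scoring function.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the attention-weighted sum of encoder states.
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy input: three encoder states (one per source token), two dimensions each.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]
weights, context = attend(decoder_state, encoder_states)
```

In this toy input the second and third encoder states align best with the decoder state, so they receive equal and larger weights than the first. In a full Seq2Seq model the context vector would be combined with the decoder state to predict the next summary word.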