Research on Constructing Technology of Implicit Hierarchical Topic Network Based on FP-Growth

2019 
Topic extraction for books is of great significance to the development of intelligent reading systems, question answering systems, and other applications. Compared with the topics of microblogs and of scientific and technical literature, the topics of a book are multiple, hierarchical, networked, and share information with one another, so topic extraction for books is more complicated and difficult. This article addresses problems in intelligent reading systems such as quickly locating the content relevant to an answer and retrieving across topics. Starting from topic trees extracted from the chapters of a novel with the TF-IDF algorithm, the FP-Growth algorithm is used to mine the association relationships among topic words. These associations are then used to analyze the hidden relationships between topics, and an implicit hierarchical topic network (IHTN) of the novel text is proposed and constructed. The experimental results show that this method can completely extract the topic network of novel texts, effectively extract the relationships between chapters, significantly reduce answer retrieval time in the question answering system, and improve the accuracy of the answers.
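As a rough illustration of the mining step described in the abstract, the sketch below runs a compact FP-Growth over per-chapter sets of topic words and turns the frequent word pairs into weighted edges. It is a minimal sketch under stated assumptions, not the authors' implementation: the chapter data, the `min_support` value, the names `build_tree`, `fp_growth`, and `edges`, and the idea of using frequent pairs as network edges are all hypothetical, and the topic words are assumed to have been selected already (for example by TF-IDF).

```python
from collections import defaultdict


class Node:
    """One node of an FP-tree: an item, a count, and links up and down."""

    def __init__(self, item, parent):
        self.item, self.parent, self.count, self.children = item, parent, 0, {}


def build_tree(transactions, min_support):
    """Build an FP-tree from (items, count) pairs; return (header table, supports)."""
    freq = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            freq[item] += count
    freq = {i: c for i, c in freq.items() if c >= min_support}
    root, header = Node(None, None), defaultdict(list)
    for items, count in transactions:
        # Keep frequent items only, ordered by descending support (ties by name).
        ordered = sorted((i for i in set(items) if i in freq),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += count
    return header, freq


def fp_growth(transactions, min_support, suffix=frozenset()):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    header, freq = build_tree(transactions, min_support)
    result = {}
    for item, nodes in header.items():
        itemset = suffix | {item}
        result[itemset] = freq[item]
        # Conditional pattern base: the prefix path above each node of `item`.
        conditional = []
        for node in nodes:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        result.update(fp_growth(conditional, min_support, itemset))
    return result


# Hypothetical topic words per chapter (assumed output of a TF-IDF step).
chapters = [
    {"war", "family", "journey"},
    {"war", "family"},
    {"war", "journey"},
    {"family", "love"},
]
patterns = fp_growth([(words, 1) for words in chapters], min_support=2)

# Assumption: frequent word pairs become weighted edges between topic words.
edges = sorted((*sorted(s), c) for s, c in patterns.items() if len(s) == 2)
```

Here each chapter is treated as one transaction, so the support of a word pair is the number of chapters in which both words appear as topic words. How such pairs are mapped onto links between topics or chapters in the IHTN is not specified in the abstract and would depend on the paper's own construction.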