A novel keyphrase extraction method by combining FP-growth and LDA

2017 
Fast-growing technologies such as cloud computing, big data, mobile Internet, and artificial intelligence have driven the emergence of many new phrases. In this paper, we propose a novel two-step keyphrase extraction method that combines the FP-growth algorithm with Latent Dirichlet Allocation (LDA) topic modeling. In the first step, we apply the FP-growth algorithm to obtain neighboring words that frequently co-occur as candidate phrases. In the second step, we extract significant keyphrases using LDA models. Our experiments on two datasets, CVE-2015 and 20-newsgroups, show that the proposed approach can extract significant keyphrases and that these phrases can help improve text classification accuracy.
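To make the first step concrete, the following is a minimal sketch of candidate-phrase generation, not the paper's implementation. It swaps FP-growth for plain minimum-support counting over adjacent word pairs, which is enough to illustrate the idea of keeping only neighboring words that co-occur often. The function name `candidate_phrases`, the `min_support` parameter, the whitespace tokenization, and the toy documents are all assumptions made for illustration. The LDA-based selection of the second step is not shown.

```python
from collections import Counter


def candidate_phrases(docs, min_support=2):
    """Return adjacent word pairs found in at least `min_support` documents.

    Illustrative stand-in for the FP-growth step: brute-force support
    counting over neighboring word pairs (hypothetical helper).
    """
    support = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        # Count each pair once per document, so support is document frequency.
        pairs = set(zip(tokens, tokens[1:]))
        support.update(pairs)
    return {" ".join(pair): n for pair, n in support.items() if n >= min_support}


# Toy documents, invented for this example.
docs = [
    "buffer overflow in the kernel driver",
    "remote buffer overflow allows code execution",
    "kernel driver allows code execution",
]
phrases = candidate_phrases(docs, min_support=2)
```

In a fuller pipeline, the surviving candidates would be passed to a topic model so that only phrases carrying high weight in some topic are kept as keyphrases.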