Co-HITS-Ranking Based Query-Focused Multi-document Summarization

2010 
Graph-based ranking methods have been successfully applied to multi-document summarization by adopting link analysis algorithms such as PageRank and HITS to incorporate diverse relationships into sentence evaluation. Both the homogeneous relationships between sentences and the heterogeneous relationships between sentences and documents have been investigated in the past. However, for query-focused multi-document summarization, three other kinds of relationships (i.e., the relationships between documents, the relationships between the given query and documents, and the sentence-to-document correlation strength) are seldom considered when computing sentence importance. To address these limitations, this study proposes a novel Co-HITS-Ranking based approach to query-biased summarization, which fuses all of the above relationships, whether homogeneous or heterogeneous, in a unified two-layer graph model under the assumption that significant sentences and significant documents can be both self-boosted and mutually boosted. In the model, the manifold-ranking algorithm is first applied to sentences and to documents separately to assign initial biased information richness scores, based only on the local recommendations between homogeneous objects. Then, by adopting the Co-HITS-Ranking algorithm, the initial biased information richness scores of sentences and documents are incorporated into a mutual reinforcement framework that co-ranks the heterogeneous objects jointly. The final score of each sentence is obtained through an iterative update process. Experimental results on the DUC datasets demonstrate the effectiveness of the proposed approach.
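The two-stage procedure described above can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption: the manifold-ranking step uses the standard update f <- alpha * S f + (1 - alpha) * y with a symmetrically normalized affinity matrix, and the mutual reinforcement step uses a generic Co-HITS style update with damping. The function names (manifold_rank, co_hits), the parameters (alpha, lam_s, lam_d), and the toy matrices are hypothetical and are not taken from the paper.

```python
import numpy as np

def manifold_rank(W, y, alpha=0.6, iters=200):
    """Assumed manifold-ranking update: f <- alpha * S f + (1 - alpha) * y,
    where S = D^-1/2 W D^-1/2 and y is a query-relevance prior."""
    W = np.asarray(W, dtype=float).copy()
    np.fill_diagonal(W, 0.0)              # no self-links
    y = np.asarray(y, dtype=float)
    deg = W.sum(axis=1)
    deg[deg == 0] = 1.0                   # guard isolated nodes
    inv_sqrt = 1.0 / np.sqrt(deg)
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]
    f = y.copy()
    for _ in range(iters):
        f = alpha * (S @ f) + (1 - alpha) * y
    return f

def co_hits(A, s0, d0, lam_s=0.5, lam_d=0.5, iters=100):
    """Assumed Co-HITS style co-ranking on a sentence-document bipartite graph.
    A[i, j] is the correlation strength between sentence i and document j.
    Each score mixes its own initial value with scores propagated from the
    other layer."""
    A = np.asarray(A, dtype=float)
    col = A.sum(axis=0)
    col[col == 0] = 1.0
    row = A.sum(axis=1)
    row[row == 0] = 1.0
    P_ds = A / col                        # document -> sentence transitions
    P_sd = A / row[:, None]               # sentence -> document transitions
    s0 = np.asarray(s0, dtype=float)
    d0 = np.asarray(d0, dtype=float)
    s0 = s0 / s0.sum()
    d0 = d0 / d0.sum()
    s, d = s0.copy(), d0.copy()
    for _ in range(iters):
        s_new = (1 - lam_s) * s0 + lam_s * (P_ds @ d)
        d_new = (1 - lam_d) * d0 + lam_d * (P_sd.T @ s)
        s, d = s_new, d_new
    return s, d

# Toy data (hypothetical): 4 sentences, 2 documents.
W_sent = np.array([[0.0, 0.5, 0.1, 0.0],
                   [0.5, 0.0, 0.2, 0.1],
                   [0.1, 0.2, 0.0, 0.6],
                   [0.0, 0.1, 0.6, 0.0]])
W_doc = np.array([[0.0, 0.4],
                  [0.4, 0.0]])
A = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])
q_sent = np.array([1.0, 0.0, 0.0, 0.5])   # query relevance of each sentence
q_doc = np.array([0.7, 0.3])              # query relevance of each document

s_init = manifold_rank(W_sent, q_sent)    # stage 1: homogeneous ranking
d_init = manifold_rank(W_doc, q_doc)
s_final, d_final = co_hits(A, s_init, d_init)   # stage 2: co-ranking
ranking = np.argsort(-s_final)            # sentence indices, best first
```

Setting lam_s and lam_d to 0 in this sketch reduces the result to the normalized initial scores, so the two parameters control how much weight the cross-layer reinforcement receives relative to the manifold-ranking scores.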