Cluster Labeling Extraction and Ranking Feature Selection for High Quality XML Pseudo Relevance Feedback Fragments Set

2013 
In traditional pseudo relevance feedback, the main cause of topic drift is the low quality of the feedback source. Clustering the search results is an effective way to improve the quality of the feedback set. For XML data, how to cluster effectively and then identify good XML fragments from the clustering results is an intricate problem. This paper focuses mainly on the latter problem. Based on k-medoid clustering results, this work first proposes a cluster label extraction method to select candidate relevant clusters. Second, multiple ranking features are introduced to help identify the relevant XML fragments within the candidate clusters. Finally, the top N fragments compose the high-quality pseudo feedback set. Experimental results on standard INEX test data show that, on the one hand, the proposed cluster label extraction method obtains proper cluster key terms and leads to appropriate candidate cluster selection. On the other hand, the presented ranking features are beneficial for identifying relevant XML fragments. As a result, the quality of the feedback set is ensured.
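
The pipeline outlined above (label each cluster, keep the candidate clusters, rank their fragments, take the top N) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's method: the abstract does not specify the labeling heuristic or the ranking features, so the frequency-based label, the query-overlap selection rule, the single coverage feature, and all function names here are hypothetical. Clusters are assumed to be already computed (for example by k-medoid clustering) and fragments are treated as plain whitespace-tokenized text.

```python
from collections import Counter


def extract_cluster_label(fragments, k=3):
    """Assumed heuristic: label a cluster with its k most frequent terms."""
    counts = Counter()
    for frag in fragments:
        counts.update(frag.lower().split())
    # Sort by descending frequency, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def select_candidate_clusters(clusters, query_terms, k=3):
    """Assumed rule: keep clusters whose label shares a term with the query."""
    query = {t.lower() for t in query_terms}
    return [
        cid
        for cid, frags in clusters.items()
        if query & set(extract_cluster_label(frags, k))
    ]


def score_fragment(fragment, query_terms):
    """Hypothetical ranking feature: fraction of query terms the fragment contains."""
    tokens = set(fragment.lower().split())
    query = {t.lower() for t in query_terms}
    return len(query & tokens) / len(query) if query else 0.0


def build_feedback_set(clusters, query_terms, top_n=2):
    """Pool fragments from candidate clusters and return the top_n by score."""
    candidates = select_candidate_clusters(clusters, query_terms)
    pool = [frag for cid in candidates for frag in clusters[cid]]
    # sorted() is stable, so equally scored fragments keep their original order.
    return sorted(pool, key=lambda f: score_fragment(f, query_terms), reverse=True)[:top_n]
```

In practice several features would be combined into one score, and the fragments would carry XML structure rather than flat text; the sketch only shows how the stages connect.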