Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task.

2013 
Understanding the intent underlying a search query has recently attracted enormous research interest. Two challenging issues are worth noting. First, the words within a query are usually ambiguous, while the query is in most cases too short to disambiguate them. Second, in some cases the ambiguity cannot be resolved from the limited query context alone. The ambiguity must therefore be resolved or analyzed within context beyond the query itself. This paper presents the intent mining system developed by THCIB and THUIS, which handles English and Chinese queries respectively, using four types of context: the query, a knowledge base, search results, and user behavior statistics. The major contributions are summarized as follows: (1) Concepts extracted from the query are used to extend the query. (2) Concepts are used to extract explicit subtopic candidates from Wikipedia. (3) LDA is applied to discover explicit subtopic candidates within search results. (4) Sense-based subtopic clustering and entity analysis are conducted to cluster the subtopic candidates so as to discover the exclusive intents. (5) Intents are ranked with a unified intent ranking model. Experimental results indicate that our intent mining method is effective.