Two-level text similarity calculation method based on subjective and objective semantics

2013 
A two-level text similarity calculation method based on subjective and objective semantics divides each text into a topic and a main body. A topic-word vector is built by filtering, and a low-dimensional main body-word vector is built by extracting keywords. A word semantic similarity calculation that combines subjective and objective semantics is then used to compute the similarity between word vectors, yielding the topic similarity and the main body similarity, from which the text similarity is obtained. The word semantic similarity is calculated on the basis of word-text indexes of HowNet and a corpus, so that words are expressed concisely and the results agree with both subjective concepts and the objective semantic environment. Because the topic and the main body are given equal importance and the combined subjective-objective word similarity is used, the method avoids a high-dimensional text-word vector, extracts text information fully, and improves the accuracy of text similarity results. It is suitable for text similarity analysis in a variety of settings.
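The two-level structure described in the abstract can be illustrated with a minimal Python sketch. It is not the patented formulas, which the abstract does not give. Everything specific here is an assumption made for illustration: the toy lexicon `TOY_HOWNET` standing in for HowNet-derived scores, the cosine over per-document counts standing in for the corpus word-text index, the mixing weight `alpha`, and the best-match averaging in `vector_sim`. Only the equal weighting of topic and main body comes from the abstract.

```python
import math

# Hypothetical stand-in for HowNet-derived (subjective) word similarity scores.
TOY_HOWNET = {frozenset(("car", "automobile")): 0.9}


def subjective_sim(w1, w2, lexicon=TOY_HOWNET):
    """Concept-based similarity looked up in a lexicon (assumed lookup form)."""
    if w1 == w2:
        return 1.0
    return lexicon.get(frozenset((w1, w2)), 0.0)


def objective_sim(w1, w2, index):
    """Cosine similarity of two words' per-document count vectors.

    `index` maps a word to its counts across corpus documents; this is an
    assumed reading of the "word-text index".
    """
    v1, v2 = index.get(w1), index.get(w2)
    if v1 is None or v2 is None:
        return 0.0
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(b * b for b in v2))
    if n1 == 0 or n2 == 0:
        return 0.0
    return dot / (n1 * n2)


def word_sim(w1, w2, index, alpha=0.5):
    """Blend subjective and objective similarity; alpha is an assumed weight."""
    return alpha * subjective_sim(w1, w2) + (1 - alpha) * objective_sim(w1, w2, index)


def vector_sim(words_a, words_b, index):
    """Symmetric best-match average between two word lists (assumed scheme)."""
    if not words_a or not words_b:
        return 0.0

    def directed(src, dst):
        return sum(max(word_sim(s, d, index) for d in dst) for s in src) / len(src)

    return 0.5 * (directed(words_a, words_b) + directed(words_b, words_a))


def text_sim(topic_a, body_a, topic_b, body_b, index):
    """Topic and main body are weighted equally, as the abstract states."""
    topic_score = vector_sim(topic_a, topic_b, index)
    body_score = vector_sim(body_a, body_b, index)
    return 0.5 * topic_score + 0.5 * body_score
```

In this sketch the topic and body word lists are passed in already filtered and keyword-extracted, so those two steps are outside its scope. Calling `text_sim(["car"], ["banana"], ["automobile"], ["banana"], index)` with a small hand-built `index` compares the two topics through the blended word similarity and the two bodies through the same function, then averages the two scores.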