Unlabeled Short Text Similarity With LSTM Encoder

2019 
Short texts play an important role in our daily communication and have been applied in many fields. In this paper, we propose a novel short text similarity measurement algorithm based on a long short-term memory (LSTM) encoder. It consists of preprocessing, training, and evaluating stages. The preprocessing stage normalizes the input, which helps avoid the vanishing gradient problem during backpropagation and speeds up training. The training stage leverages the inception module to extract features of different dimensions and improves the LSTM network to process the relationships within word sequences. The evaluating stage employs cosine distance to calculate the semantic similarity of two short texts. We run experiments on two short text datasets of different lengths and analyze the results. The results show that our algorithm can fully exploit the semantic and sequence information of short texts and achieves higher accuracy and recall than other short text similarity measurement algorithms.
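The evaluating stage can be illustrated with a minimal sketch, assuming each short text has already been encoded into a fixed-length vector. The function name `cosine_similarity` and the example vectors are hypothetical and are not taken from the paper; the vectors stand in for the output of the LSTM encoder, which is not reproduced here.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors.

    Returns a value in [-1, 1]. Returning 0.0 for a zero vector is a
    convention chosen for this sketch, not something the paper specifies.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical encodings of two short texts (placeholder values).
encoding_a = [0.2, 0.7, 0.1]
encoding_b = [0.25, 0.6, 0.05]
similarity = cosine_similarity(encoding_a, encoding_b)
```

Cosine distance is commonly defined as one minus this similarity, so either quantity can be used to rank pairs of texts. Because the measure depends only on the angle between the two vectors, it is insensitive to differences in their magnitudes.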