Sentence representation with manifold learning for biomedical texts

2021 
Abstract Sentence representation approaches based on deep learning have become a major part of natural language processing, and pretrained sentence representations are widely applied to biomedical texts. However, the geometric basis of sentence representations has not yet been carefully studied for biomedical texts. In this paper, we focus on exploiting the geometric structure of sentence representations to improve the representation of biomedical text. To mine geometric structure information from sentence representations, we introduce manifold learning into biomedical sentence representation; it brings the similarity of sentences in Euclidean space closer to their semantic similarity. First, we use a pretrained sentence representation method to obtain a representation of each biomedical sentence. We then use manifold learning to construct an adjacency graph over the sentence representations, which characterizes their local geometric structure and reveals the underlying relations among the sentences. By describing these latent relations among sentences, the manifold method improves performance on downstream biomedical text tasks. We evaluated our sentence representation method on biomedical text tasks. The experimental results show that our model achieved better results than several commonly used sentence representation methods.
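The pipeline described in the abstract (pretrained sentence vectors, then an adjacency graph, then a manifold embedding) can be illustrated with a minimal sketch. The abstract does not specify the manifold learning algorithm, so this example assumes a generic k-nearest-neighbor graph with Laplacian eigenmaps as a stand-in; it is not the authors' implementation. The function names `knn_adjacency` and `laplacian_eigenmaps`, the cosine similarity measure, the neighbor count `k`, and the random vectors standing in for pretrained sentence embeddings are all hypothetical choices made for illustration.

```python
import numpy as np


def knn_adjacency(X, k=3):
    """Build a symmetric k-nearest-neighbor adjacency matrix (cosine similarity).

    X is an (n, d) array of sentence vectors; k must be smaller than n.
    """
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-12, None)      # unit-normalize each row
    sim = Xn @ Xn.T                           # pairwise cosine similarity
    np.fill_diagonal(sim, -np.inf)            # a sentence is not its own neighbor
    n = X.shape[0]
    nearest = np.argsort(-sim, axis=1)[:, :k]  # indices of the k most similar rows
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(W, W.T)                 # symmetrize: edge if either side picks it


def laplacian_eigenmaps(W, dim=2):
    """Embed graph nodes using eigenvectors of the normalized graph Laplacian."""
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.clip(degree, 1e-12, None))
    # L_sym = I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)           # eigenvalues in ascending order
    # Skip the first (trivial) eigenvector; rescale to solve L y = lambda D y.
    return vecs[:, 1:dim + 1] * d_inv_sqrt[:, None]


if __name__ == "__main__":
    # Assumption: random vectors stand in for pretrained sentence embeddings.
    rng = np.random.default_rng(0)
    sentence_vectors = rng.normal(size=(10, 8))
    W = knn_adjacency(sentence_vectors, k=3)
    embedding = laplacian_eigenmaps(W, dim=2)
    print(embedding.shape)
```

In this sketch the adjacency matrix captures only local neighborhoods, so sentences that are close in the graph stay close in the low-dimensional embedding. If the neighbor graph is disconnected, the leading eigenvectors mainly separate the components, so a larger `k` may be needed in practice.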