Biomedical Named-Entity Recognition by Hierarchically Fusing BioBERT Representations and Deep Contextual-Level Word-Embedding

2020 
Text mining in the biomedical domain is increasingly important as the volume of biomedical documents grows. Thanks to advances in natural language processing (NLP), extracting valuable information from the biomedical literature is gaining popularity among researchers, and deep learning has enabled the development of effective biomedical text mining models. However, directly applying advances in NLP to biomedical sources often yields unsatisfactory results, because the word distribution drifts from general-domain corpora to specific biomedical corpora, and this drift introduces linguistic ambiguities. To overcome these challenges, this paper presents a novel method for biomedical named-entity recognition (BioNER) that hierarchically fuses representations from BioBERT, which is trained on biomedical corpora, with deep contextual-level word embeddings to handle the linguistic challenges within biomedical literature. The fused text representation is then fed to an attention-based bidirectional long short-term memory (BiLSTM) network with a conditional random field (CRF) layer for the BioNER task. The experimental analysis shows that the proposed end-to-end methodology outperforms existing state-of-the-art methods for the BioNER task.
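To illustrate the role of the CRF layer named in the abstract, the following is a minimal sketch of Viterbi decoding over per-token tag scores. It is not the authors' implementation: the function name `viterbi_decode`, the tag set, and all emission and transition scores are hypothetical values chosen for illustration. In the described pipeline the emission scores would come from the attention-based BiLSTM over the fused representation; here they are hard-coded.

```python
from typing import Dict, List, Tuple

Tag = str


def viterbi_decode(
    emissions: List[Dict[Tag, float]],
    transitions: Dict[Tuple[Tag, Tag], float],
    tags: List[Tag],
) -> Tuple[List[Tag], float]:
    """Return the highest-scoring tag sequence and its score.

    emissions[t][tag] is the (hypothetical) score of `tag` at token t.
    transitions[(prev, cur)] is the score of moving from prev to cur;
    missing pairs default to 0.0.
    """
    if not emissions:
        return [], 0.0

    # best[t][tag]: best score of any path that ends in `tag` at token t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[Tag, Tag]] = []

    for t in range(1, len(emissions)):
        scores: Dict[Tag, float] = {}
        pointers: Dict[Tag, Tag] = {}
        for cur in tags:
            # Pick the previous tag that maximizes path score into `cur`.
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Follow back-pointers from the best final tag.
    last = max(tags, key=lambda tag: best[-1][tag])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical three-token example with BIO tags for a gene mention.
tags = ["O", "B-Gene", "I-Gene"]
emissions = [
    {"O": 2.0, "B-Gene": 0.1, "I-Gene": 0.0},
    {"O": 0.2, "B-Gene": 1.0, "I-Gene": 1.2},
    {"O": 0.3, "B-Gene": 0.1, "I-Gene": 1.5},
]
transitions = {
    ("O", "I-Gene"): -10.0,      # an entity cannot start with I-
    ("B-Gene", "I-Gene"): 1.0,
    ("I-Gene", "I-Gene"): 0.5,
}

path, score = viterbi_decode(emissions, transitions, tags)
print(path, score)
```

In this toy example, choosing the highest emission score at each token independently would give `O, I-Gene, I-Gene`, which is not a valid BIO sequence. Decoding jointly with transition scores avoids that, which is the usual motivation for placing a CRF on top of a BiLSTM tagger.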