Use and validation of text mining and cluster algorithms to derive insights from Corona Virus Disease-2019 (COVID-19) medical literature

2021 
Abstract The emergence of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) late last year has not only led to the world-wide coronavirus disease 2019 (COVID-19) pandemic but also a deluge of biomedical literature. Following the release of the COVID-19 open research dataset (CORD-19) comprising over 200,000 scholarly articles, we a multi-disciplinary team of data scientists, clinicians, medical researchers and software engineers developed an innovative natural language processing (NLP) platform that combines an advanced search engine with a biomedical named entity recognition extraction package. In particular, the platform was developed to extract information relating to clinical risk factors for COVID-19 by presenting the results in a cluster format to support knowledge discovery. Here we describe the principles behind the development, the model and the results we obtained.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    20
    References
    0
    Citations
    NaN
    KQI
    []