Evaluation of Disease-Associated Text-Mining Databases
2015
PubMed contains about 20 million scientific articles and is a rich source of knowledge. Extracting information from these articles is one of the challenges in biology, and many text-mining approaches have been developed to address it. However, the accuracy of text-mined results is still in question. Here we evaluated three text-mining databases using genes associated with Alzheimer's disease. Their per-gene accuracy is high (57-100%), but their per-abstract accuracy is relatively low (33-64%). This suggests that a gene-disease association is identified well when abundant articles are available, whereas genes with fewer articles may be wrongly identified as disease-associated. Consequently, human curation remains complementary to current text-mining approaches, and future text-mining methods should improve their accuracy for genes with few articles or little information.
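The gap between the two metrics can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so the ones used here are assumptions: per-abstract accuracy is taken as the fraction of text-mined (gene, abstract) pairs that a curator confirms, and per-gene accuracy as the fraction of genes with at least one confirmed abstract. The function names and the toy data are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def per_abstract_accuracy(pairs):
    """Fraction of (gene, abstract) pairs confirmed by a curator (assumed definition)."""
    if not pairs:
        raise ValueError("no pairs to evaluate")
    return sum(1 for _, _, ok in pairs if ok) / len(pairs)


def per_gene_accuracy(pairs):
    """Fraction of genes with at least one confirmed abstract (assumed definition)."""
    if not pairs:
        raise ValueError("no pairs to evaluate")
    confirmed = defaultdict(bool)
    for gene, _, ok in pairs:
        confirmed[gene] = confirmed[gene] or ok
    return sum(1 for ok in confirmed.values() if ok) / len(confirmed)


# Hypothetical toy data: (gene, abstract id, curator judged the association correct)
toy_pairs = [
    ("GENE_A", "abs1", True),
    ("GENE_A", "abs2", False),
    ("GENE_A", "abs3", False),
    ("GENE_B", "abs4", True),
    ("GENE_C", "abs5", False),
]

print(per_abstract_accuracy(toy_pairs))  # 2 of 5 pairs confirmed
print(per_gene_accuracy(toy_pairs))      # 2 of 3 genes confirmed
```

Under these assumed definitions, a gene with many abstracts needs only one correct hit to count as correct, so per-gene accuracy can stay high while per-abstract accuracy is low. A gene with a single abstract, such as the hypothetical GENE_C, has no such margin.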