leADS: improved metabolic pathway inference based on active dataset subsampling
2021
Metabolic pathways are composed of reaction sequences catalyzed by enzymes. The set of reactions within and between cells comprises a reactome. Pathways and reactomes can be predicted from organismal or multi-organismal genomes using rule-based or machine learning methods. While machine learning methods overcome issues of probability and scale associated with rule-based methods, several complications remain that can degrade performance, including inadequately labeled training data, missing feature information, and inherent imbalances in the distribution of pathways within a dataset. Here, we present leADS (multi-label learning based on active dataset subsampling), a machine learning method that uses subsampling to reduce the negative impact of class imbalance on training loss. We demonstrate leADS's performance on organismal and multi-organismal datasets in comparison with other machine learning pathway prediction methods.
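The abstract does not say how examples are chosen during subsampling. As a rough illustration only, the sketch below assumes one common active-learning criterion: keep the examples whose predicted pathway labels are most uncertain, scored by the sum of per-label binary entropies. The function names, the `fraction` parameter, and the toy probabilities are all hypothetical and are not taken from leADS.

```python
import math


def label_entropy(probs, eps=1e-12):
    """Sum of binary entropies over the labels of one example.

    `probs` holds the predicted probability of each pathway label.
    Higher values mean the model is less certain about this example.
    """
    total = 0.0
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= p * math.log(p) + (1.0 - p) * math.log(1.0 - p)
    return total


def subsample_most_uncertain(prob_rows, fraction=0.5):
    """Return indices of the most uncertain examples (assumed criterion).

    Keeps roughly `fraction` of the examples, ranked by entropy,
    most uncertain first.
    """
    n_keep = max(1, round(fraction * len(prob_rows)))
    ranked = sorted(
        range(len(prob_rows)),
        key=lambda i: label_entropy(prob_rows[i]),
        reverse=True,
    )
    return ranked[:n_keep]


# Toy predictions: 4 examples x 3 pathway labels (made-up numbers).
prob_rows = [
    [0.99, 0.01, 0.98],  # confident
    [0.50, 0.45, 0.55],  # very uncertain
    [0.90, 0.10, 0.50],  # uncertain on one label
    [0.02, 0.97, 0.01],  # confident
]
selected = subsample_most_uncertain(prob_rows, fraction=0.5)
print(selected)
```

In a training loop under this assumption, the selected indices would form the subsample used for the next update, so that confidently predicted (often majority-class) examples contribute less to the loss.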