Multi-Label Classification Based On Subcellular Region-Guided Feature Description For Protein Localisation.
2021
In this paper, we present a multi-label classification pipeline and a novel feature descriptor for protein subcellular localisation. The challenge is to develop a computational model that can classify multi-site proteins from multi-label images in a highly imbalanced dataset with a long-tail distribution. To address this challenge, we design a Location-Sorted Random Projections feature descriptor that represents the image intensity and gradient of the protein of interest with reference to the correlated cellular region. The Multilabel Synthetic Minority Over-sampling Technique (MLSMOTE) is optimised to generate synthetic features with labels to handle class imbalance. Our method achieves state-of-the-art performance on a large-scale public dataset and demonstrates excellent performance on the minority classes.
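To make the oversampling step concrete, the following is a minimal sketch of generic multi-label SMOTE-style oversampling, not the optimised variant used in the paper (the abstract does not specify it). It assumes that minority labels are those whose imbalance ratio exceeds the mean ratio, that synthetic features are interpolated between a seed sample and one of its nearest neighbours, and that synthetic labels are set by majority vote. The function names `imbalance_ratios` and `mlsmote_sketch` and the parameter `k` are hypothetical.

```python
import numpy as np


def imbalance_ratios(Y):
    """Per-label imbalance ratio: count of the most frequent label / count of each label."""
    counts = Y.sum(axis=0).astype(float)
    counts[counts == 0] = np.nan  # labels that never occur are ignored
    return np.nanmax(counts) / counts


def mlsmote_sketch(X, Y, k=5, seed=0):
    """Append synthetic (feature, label-set) pairs for minority labels.

    X: (n_samples, n_features) float array of feature descriptors.
    Y: (n_samples, n_labels) binary indicator matrix.
    Assumption: a label is 'minority' if its imbalance ratio exceeds the mean.
    """
    rng = np.random.default_rng(seed)
    irlbl = imbalance_ratios(Y)
    mean_ir = np.nanmean(irlbl)
    new_X, new_Y = [], []
    for label in np.flatnonzero(irlbl > mean_ir):
        idx = np.flatnonzero(Y[:, label] == 1)
        if idx.size < 2:
            continue  # cannot interpolate from a single sample
        Xm, Ym = X[idx], Y[idx]
        kk = min(k, idx.size - 1)
        for i in range(idx.size):
            # Nearest neighbours of the seed among samples sharing this label.
            dist = np.linalg.norm(Xm - Xm[i], axis=1)
            dist[i] = np.inf
            nbrs = np.argsort(dist)[:kk]
            ref = rng.choice(nbrs)
            # Synthetic feature: random point on the segment seed -> neighbour.
            gap = rng.random()
            new_X.append(Xm[i] + gap * (Xm[ref] - Xm[i]))
            # Synthetic labels: kept if present in more than half of seed + neighbours.
            votes = Ym[i] + Ym[nbrs].sum(axis=0)
            new_Y.append((votes > (kk + 1) / 2).astype(Y.dtype))
    if not new_X:
        return X.copy(), Y.copy()
    return np.vstack([X, np.array(new_X)]), np.vstack([Y, np.array(new_Y)])
```

Because each synthetic feature is a convex combination of two real minority samples, it stays within the range of the observed minority features. Oversampling in feature space, rather than in image space, is what lets the synthetic samples carry a full label set.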