Prediction of Hormone-Binding Proteins Based on K-mer Feature Representation and Naive Bayes

2021 
Hormone-binding proteins (HBPs) are soluble carrier proteins that interact selectively with different types of hormones and affect various life activities of the body. HBPs play an important role in the growth of organisms, but their specific role is still unclear. Correctly identifying HBPs is therefore the first step towards understanding and studying their biological function. However, traditional biochemical experiments are costly and time-consuming, which makes it difficult to identify HBPs among an ever-increasing number of proteins, so the characterization of HBPs has become a challenging task for researchers. To measure the effectiveness of HBPs in humans, an accurate and reliable prediction model for their identification is desirable. In this paper, we construct the prediction model HBP_NB. First, HBP data were collected from the UniProt database to establish a high-quality dataset. Second, the k-mer (k = 3) feature representation method was used to extract features from this dataset. Third, a feature selection algorithm was used to reduce the dimensionality of the extracted features and select an optimal feature subset. Finally, the selected features were fed into the prediction model as input vectors, and 10-fold cross-validation was used to evaluate HBP_NB. Rigorous nested 10-fold cross-validation showed that our method achieved a prediction accuracy of 95.447%, a sensitivity of 94.167%, and a specificity of 96.732%. These results indicate that our model is feasible and effective.
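The k-mer feature extraction and the three reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 20-letter amino-acid alphabet, the normalization of counts to frequencies, the handling of non-standard residues, and the function names `kmer_features` and `evaluate` are all assumptions made for illustration, since the abstract does not specify these details.

```python
from collections import Counter
from itertools import product

# Assumption: only the 20 standard amino acids form the k-mer vocabulary.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_features(sequence, k=3):
    """Return normalized k-mer frequencies as a list of length 20**k.

    Hypothetical helper: k-mers containing non-standard residues are
    ignored, and counts are divided by the number of valid k-mers.
    """
    vocab = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    # Slide a window of width k over the sequence and count each k-mer.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[m] for m in vocab)
    return [counts[m] / total if total else 0.0 for m in vocab]


def evaluate(tp, tn, fp, fn):
    """Compute accuracy, sensitivity, and specificity from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # fraction of true HBPs recovered
    specificity = tn / (tn + fp)  # fraction of non-HBPs correctly rejected
    return accuracy, sensitivity, specificity
```

With k = 3, each protein maps to a vector of 20^3 = 8000 entries, which is why a feature selection step before the Naive Bayes classifier is useful.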