Convex hull principle for classification and phylogeny of eukaryotic proteins

2018 
Abstract This study quantitatively validates the principle that the biological properties associated with a given genotype are determined by the distribution of amino acids. To visualize this central law of molecular biology, each protein was represented by a point in 250-dimensional space based on its amino acid distribution. Proteins from the same family are found to cluster together, leading to the principle that the convex hull surrounding the protein points of one family does not intersect the convex hulls of other protein families. This principle was verified computationally for all available and reliable protein kinases and human proteins. In addition, we generated 2,328,761 figures to show that the convex hulls of different families are disjoint from each other. The classification achieves high and robust accuracy (95.75% and 97.5%), and the resulting phylogenetic trees are reasonable, which further validates our method.
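The convex hull principle can be illustrated with a minimal sketch. The abstract does not specify how the 250-dimensional vector is built, so the example below substitutes a plain 20-dimensional amino acid frequency vector, which is an assumption and not the representation used in the study. It also uses a simple perceptron search rather than the authors' procedure: if a hyperplane strictly separating two finite point sets is found, their convex hulls are certainly disjoint, while a failure to find one within the iteration budget is inconclusive. The names `composition_vector` and `find_separating_hyperplane` are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(sequence):
    """Map a protein sequence to a 20-dimensional frequency vector.

    Assumption: this stands in for the paper's 250-dimensional vector,
    whose construction is not given in the abstract. The sequence must
    contain at least one standard residue.
    """
    counts = Counter(sequence)
    total = sum(counts[a] for a in AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def find_separating_hyperplane(points_a, points_b, max_epochs=1000):
    """Perceptron search for (w, b) with w.x + b > 0 on A and < 0 on B.

    Returns (w, b) when found, which certifies that the convex hulls of
    the two point sets are disjoint. Returns None when nothing is found
    within max_epochs; that outcome is inconclusive on its own.
    """
    dim = len(points_a[0])
    w = [0.0] * dim
    b = 0.0
    labeled = [(p, 1) for p in points_a] + [(p, -1) for p in points_b]
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in labeled:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:
                # Misclassified (or on the boundary): nudge the hyperplane.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            return w, b
    return None


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    family_a = [composition_vector(s) for s in ["AAAAG", "AAAGG", "AAAAAG"]]
    family_b = [composition_vector(s) for s in ["WWWWY", "WWWYY"]]
    result = find_separating_hyperplane(family_a, family_b)
    print("hulls certified disjoint:", result is not None)
```

For a definitive answer in both directions, the same question can be posed as a linear programming feasibility problem; the perceptron is used here only to keep the sketch dependency-free.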