HDAC3i-Finder: A Machine Learning-based Computational Tool to Screen for HDAC3 Inhibitors.

2020 
Histone deacetylase 3 (HDAC3) is a potential drug target for the treatment of human diseases such as cancer and diabetes. Machine learning (ML), an essential cheminformatics approach, has been widely used for QSAR modeling; however, it has not yet been applied to HDAC3. To this end, we carefully compiled a set of 1098 compounds from the ChEMBL database that have been assayed against HDAC3 and calculated three different sets of molecular features for each compound, i.e., two-dimensional Mordred descriptors, MACCS keys (166 bits) and Morgan2 fingerprints (1024 bits). Five ML classifiers, i.e., KNN, SVM, RF, XGBoost and DNN, were trained on each feature set and optimized for classification. A total of 15 models were generated and carefully compared, among which the best-performing one was the XGBoost model based on the Morgan2 fingerprints, i.e., XGBoost_morgan2. Evaluated on a well-curated benchmarking set named MUBD-HDAC3, this model achieved a high early ROC enrichment (ROCE0.5%: 41.02). A further retrospective screening of an annotated chemical library in PubChem demonstrated that the best model could identify 8 novel-scaffold HDAC3 inhibitors while assaying only 1% of the compounds. To make this model accessible to the scientific community, we developed a Python GUI application named HDAC3i-Finder to facilitate prospective screening for HDAC3 inhibitors. The source code of HDAC3i-Finder is available at https://github.com/jwxia2014/HDAC3i-Finder.
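For readers unfamiliar with the early-enrichment metric quoted above, the following is a minimal sketch of how a ROC enrichment value can be computed from a ranked screening list. It assumes the common definition of ROCE as the true positive rate divided by the false positive rate at the point where a given fraction of decoys has been retrieved; the function name `roc_enrichment`, its arguments, and the handling of ties (input order) are illustrative choices, not taken from HDAC3i-Finder.

```python
from math import floor


def roc_enrichment(scores, labels, fpr=0.005):
    """Return TPR / FPR at the point where a fraction `fpr` of decoys is retrieved.

    scores: predicted scores, higher means more likely active.
    labels: 1 for active, 0 for decoy (same order as scores).
    fpr:    decoy fraction at which enrichment is read (0.005 for ROCE0.5%).
    This is an illustrative sketch, not the authors' implementation.
    """
    n_actives = sum(labels)
    n_decoys = len(labels) - n_actives
    if n_actives == 0 or n_decoys == 0:
        raise ValueError("need at least one active and one decoy")

    # Number of decoys allowed before the cutoff (at least one).
    allowed = max(1, floor(fpr * n_decoys))

    # Rank compounds from highest to lowest score (ties keep input order).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    actives_found = 0
    decoys_found = 0
    for _, label in ranked:
        if label:
            actives_found += 1
        else:
            decoys_found += 1
            if decoys_found == allowed:
                break

    tpr = actives_found / n_actives
    achieved_fpr = allowed / n_decoys
    return tpr / achieved_fpr


if __name__ == "__main__":
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
    toy_labels = [1, 1, 0, 1, 0, 0]
    print(roc_enrichment(toy_scores, toy_labels, fpr=0.5))
```

Under this definition the maximum attainable value at a 0.5% decoy fraction is 200, so a ROCE0.5% of 41.02 would correspond to roughly 20% of the actives being ranked ahead of the top 0.5% of decoys.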