Protein-protein interactions decoys datasets for machine learning algorithm development
2021
This is the most complete and diverse set of protein docking decoys derived from Benchmark5 and Scorers_set. We used three different rigid-body docking programs to generate the decoys for Benchmark5. We analyzed all docking decoys with more than 150 scoring functions from different sources (CCharPPI, FreeSASA, CIPS, CONSRANK). We provide balanced and unbalanced versions of the data. The balanced data is intended for training and testing machine learning algorithms. The unbalanced data is provided to simulate the real-world scenario. We also provide a set of rigid-body docking decoys from Interactome3D that spans 1391 interactions. We obtained the labels for this set using a weakly supervised approach that we call hAIkal. We used this data to augment the training data and improve machine learning classifiers.
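The difference between the balanced and unbalanced versions can be illustrated with a minimal sketch. It assumes the balanced set is obtained by randomly undersampling the majority class, which the description does not state. The function name `balance_by_undersampling`, the `(decoy_id, label)` layout, and the toy data are hypothetical and are not taken from the dataset files.

```python
import random


def balance_by_undersampling(decoys, seed=0):
    """Return a class-balanced subset of labelled decoys.

    `decoys` is a list of (decoy_id, label) pairs, where label 1 marks a
    near-native decoy and label 0 an incorrect one (assumed convention).
    The larger class is randomly undersampled to the size of the smaller one.
    """
    rng = random.Random(seed)
    positives = [d for d in decoys if d[1] == 1]
    negatives = [d for d in decoys if d[1] == 0]
    n = min(len(positives), len(negatives))
    balanced = rng.sample(positives, n) + rng.sample(negatives, n)
    rng.shuffle(balanced)
    return balanced


# Made-up toy data: 3 near-native decoys among 20, mimicking an unbalanced set.
toy = [(f"decoy_{i}", 1 if i < 3 else 0) for i in range(20)]
balanced = balance_by_undersampling(toy)
```

A classifier would typically be trained on a subset like `balanced` and then evaluated on the full unbalanced list, where incorrect decoys dominate.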