Self-supervised molecular pretraining strategy for low-resource reaction prediction scenarios

2021 
Faced with low-resource reaction training samples, we construct a chemical platform to address small-scale reaction prediction problems. Using a self-supervised pretraining strategy called MASS, the transformer model absorbs chemical information from about 1 billion molecules and is then fine-tuned on small-scale reaction prediction, in contrast to previous works that rely only on reaction samples. To demonstrate the broad applicability of our approach, we evaluate three different name reactions. In the Baeyer-Villiger, Heck, and Sharpless asymmetric epoxidation reaction prediction tasks, the average accuracies increase by 5.7%, 10.8%, and 4.8%, respectively, marking an important step toward low-resource reaction prediction.
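The MASS objective referenced above trains an encoder-decoder to reconstruct a masked contiguous span of the input sequence. The sketch below illustrates how such a training pair could be built from a SMILES string; the function name, mask token, and character-level tokenization are illustrative assumptions, not the paper's actual implementation (which operates on token IDs fed to a transformer).

```python
import random

MASK = "<m>"  # hypothetical mask token

def mass_example(smiles, mask_frac=0.5, seed=0):
    """Build one MASS-style training pair from a SMILES string:
    the encoder sees the string with a contiguous span replaced by
    mask tokens; the decoder must reconstruct that span.
    Illustrative sketch only -- naive character-level tokenization."""
    rng = random.Random(seed)
    tokens = list(smiles)
    span_len = max(1, int(len(tokens) * mask_frac))
    start = rng.randrange(0, len(tokens) - span_len + 1)
    # Encoder input: original sequence with the chosen span masked out.
    enc_in = tokens[:start] + [MASK] * span_len + tokens[start + span_len:]
    # Decoder target: the masked span itself.
    target = tokens[start:start + span_len]
    return enc_in, target

enc, tgt = mass_example("CC(=O)Oc1ccccc1C(=O)O")  # aspirin SMILES
```

After pretraining on such pairs, the same encoder-decoder weights would be fine-tuned on the small reactant-to-product datasets of the three name reactions.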