Representation of compounds for machine-learning prediction of physical properties

2017 
The representations of a compound, called "descriptors" or "features", play an essential role in constructing a machine-learning model of its physical properties. In this study, we adopt a procedure for generating a systematic set of descriptors from simple elemental and structural representations. First, it is applied to a large dataset of cohesive energies for about 18000 compounds computed by density functional theory (DFT). As a result, we obtain a kernel ridge prediction model with a prediction error of 0.041 eV/atom, which is close to the "chemical accuracy" of 1 kcal/mol (0.043 eV/atom). The procedure is also applied to two smaller datasets: the lattice thermal conductivity (LTC) of 110 compounds computed by DFT, and the experimental melting temperature of 248 compounds. For these datasets, we examine how the descriptor sets affect the efficiency of Bayesian optimization as well as the accuracy of the kernel ridge regression models. The descriptor sets exhibit good predictive performance.
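As a rough illustration of what building compound descriptors from elemental representations can look like, the sketch below computes the composition-weighted mean and standard deviation of two elemental quantities. The choice of quantities (atomic number and Pauling electronegativity), the choice of statistics, and the names `ELEMENTS`, `compound_descriptor`, and `kcal_per_mol_to_ev` are assumptions made for this example; the abstract does not specify the actual descriptor set. The last function only checks the unit conversion quoted above.

```python
import math

# Illustrative elemental representations (assumed for this sketch):
# atomic number Z and Pauling electronegativity chi.
ELEMENTS = {
    "Na": {"Z": 11, "chi": 0.93},
    "Cl": {"Z": 17, "chi": 3.16},
    "Mg": {"Z": 12, "chi": 1.31},
    "O": {"Z": 8, "chi": 3.44},
}


def compound_descriptor(composition):
    """Map {element: count} to [mean_Z, std_Z, mean_chi, std_chi].

    Means and standard deviations are weighted by the number of atoms
    of each element in the formula unit.
    """
    total = sum(composition.values())
    features = []
    for prop in ("Z", "chi"):
        mean = sum(n * ELEMENTS[el][prop] for el, n in composition.items()) / total
        var = sum(
            n * (ELEMENTS[el][prop] - mean) ** 2 for el, n in composition.items()
        ) / total
        features += [mean, math.sqrt(var)]
    return features


def kcal_per_mol_to_ev(value):
    """Convert kcal/mol to eV per particle (1 kcal = 4184 J, 1 eV = 96485.332 J/mol)."""
    return value * 4184.0 / 96485.332


nacl = compound_descriptor({"Na": 1, "Cl": 1})
mgo = compound_descriptor({"Mg": 1, "O": 1})
chemical_accuracy_ev = kcal_per_mol_to_ev(1.0)
```

Each compound becomes a fixed-length vector regardless of how many elements it contains, which is what allows a single regression model such as kernel ridge regression to be trained across chemically different compounds.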