Molecular discovery by optimal sequential search

2019 
In developing a new compound in chemistry or molecular biology, especially a new medicine in the pharmaceutical industry, we often need to find the candidate molecule or molecules with the best value of a desired property (e.g., binding affinity in medicine) from a large set of molecules that share the same scaffold but carry one of \(m\) distinct functional substituents at each of \(n\) different sites. The total number \(N_{\mathrm{lib}}\) of molecules in this library is \(m^n\), which in some cases can be very large (e.g., millions). The task is challenging because synthesizing and testing all of these molecules is costly and often infeasible. A new algorithm, referred to as optimal sequential search, is developed to overcome this difficulty. The algorithm is chemically intuitive, uses only information about molecular composition, and is accessible to practicing chemists. It can be applied to small, medium, and large molecule libraries. With syntheses and property measurements for a limited number of molecules, the best candidate molecules can be effectively captured from the whole library. Three examples, with library sizes of 64, 160,000, and 1,048,576, are used for illustration. For the small library, syntheses and property measurements of 17 molecules are sufficient to capture the top 7 candidate molecules; for the medium and large libraries, syntheses and property measurements of about one thousand molecules capture most or a large part of the top 500, and especially of the top 100, candidate molecules. However, the algorithm requires multiple (e.g., hundreds of) iterative rounds of synthesis and property measurement, and the time cost may not be acceptable if these rounds are performed manually. Making the algorithm practical therefore requires automating the sequential search process, which is the next task.
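To make the library-size formula concrete, the following minimal Python sketch enumerates a combinatorial library of \(m\) options at each of \(n\) sites and checks that its size is \(m^n\). The function names and the labels "A" to "D" are hypothetical placeholders, and the choice \(m = 4\), \(n = 3\) is only an assumption that happens to reproduce a library of 64 molecules; the abstract does not state which \(m\) and \(n\) were used in its examples.

```python
from itertools import product


def library_size(m: int, n: int) -> int:
    """Number of molecules with m options at each of n sites."""
    return m ** n


def enumerate_library(options, n_sites):
    """List every molecule as a tuple of one option per site."""
    return list(product(options, repeat=n_sites))


# Hypothetical labels standing in for functional groups.
options = ["A", "B", "C", "D"]
library = enumerate_library(options, n_sites=3)

# The enumerated library has exactly m**n members.
assert len(library) == library_size(len(options), 3)
```

Explicit enumeration like this is only practical for small libraries; for sizes in the millions, the point of a sequential search is to avoid evaluating every member.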