Revealing Unknown Controlled Substances and New Psychoactive Substances Using High-Resolution LC-MS/MS Machine Learning Models and the Hybrid Similarity Search Algorithm.

2021 
Machine learning models based on high-resolution LC-MS/MS spectra were constructed to address the analytical challenge of identifying unknown controlled substances and new psychoactive substances (NPSs). Using a training set of 770 LC-MS/MS barcode spectra (with binary entries, 0 or 1), generally obtained with high-resolution mass spectrometers, three classification models were generated and evaluated: an artificial neural network (ANN), a support vector machine (SVM), and a k-nearest neighbor (k-NN) model. In these models, controlled substances and NPSs were classified into 13 subgroups (benzylpiperazine, opiate, benzodiazepine, amphetamine, cocaine, methcathinone, classical cannabinoid, fentanyl, 2C series, indazole carbonyl compound, indole carbonyl compound, phencyclidine, and others). On an external test set of 193 LC-MS/MS barcode spectra, the accuracies of the ANN, SVM, and k-NN models were 72.5%, 90.0%, and 94.3%, respectively. The hybrid similarity search (HSS) algorithm was also evaluated to examine whether it can identify unknown controlled substances and NPSs whose data are unavailable in the database. When only 24 representative LC-MS/MS spectra of controlled substances and NPSs were selectively included in the database, HSS was found to identify compounds with high reliability. The machine learning models and the HSS algorithm are incorporated into our in-house AI-SNPS (artificial intelligence screener for narcotic drugs and psychotropic substances) standalone software, which is equipped with a graphical user interface. This software allows unknown controlled substances and NPSs to be identified in a convenient manner.
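To make the barcode idea concrete, the following is a minimal sketch of how a fragment list could be encoded as a binary vector and classified by a k-NN vote. It is not the AI-SNPS implementation: the m/z range, the bin width, the Tanimoto similarity, the value of k, and the helper names (`to_barcode`, `tanimoto`, `knn_predict`) are all assumptions made for illustration, and the peak lists are invented toy values rather than reference spectra.

```python
from collections import Counter

def to_barcode(peaks_mz, mz_min=50.0, mz_max=500.0, bin_width=1.0):
    """Encode fragment m/z values as a binary vector (1 = at least one peak in the bin).
    The range and bin width are assumed values, not taken from the paper."""
    n_bins = int((mz_max - mz_min) / bin_width)
    bits = [0] * n_bins
    for mz in peaks_mz:
        if mz_min <= mz < mz_max:
            bits[int((mz - mz_min) / bin_width)] = 1
    return bits

def tanimoto(a, b):
    """Tanimoto similarity between two binary vectors (an assumed choice of metric)."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 0.0

def knn_predict(query, training, k=3):
    """Return the majority subgroup label among the k most similar training barcodes."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy training set: invented peak lists, labeled with two of the 13 subgroups.
training = [
    (to_barcode([105.1, 188.1, 216.2]), "fentanyl"),
    (to_barcode([105.1, 188.1, 337.2]), "fentanyl"),
    (to_barcode([91.1, 119.1, 136.1]), "amphetamine"),
    (to_barcode([91.1, 119.1, 150.1]), "amphetamine"),
]

# An "unknown" spectrum that shares two fragment bins with the fentanyl-like entries.
query = to_barcode([105.1, 188.1, 351.2])
predicted = knn_predict(query, training, k=3)
print(predicted)
```

Because the vectors are binary, a set-overlap measure such as Tanimoto is a natural fit here; a production model would more likely rely on an established library classifier and a validated binning scheme.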