Discovery of novel conotoxin candidates using machine learning

2018 
Cone snails (genus Conus) are venomous marine snails that inject prey with a lethal cocktail of conotoxins: small, secreted, cysteine-rich peptides. Given their diversity and often high affinity for their molecular targets (ion channels, receptors, or transporters), many conotoxins have become invaluable pharmacological probes, drug leads, and therapeutics. Transcriptome sequencing of venom glands, followed by de novo assembly and homology-based toxin identification and annotation, is currently the state of the art for discovering new conotoxins. However, homology-based search techniques, by definition, can only detect novel toxins that are homologous to previously reported conotoxins. To overcome this limitation, we have created ConusPipe, a machine learning tool that uses prominent chemical characteristics of conotoxins to predict whether a transcript with no otherwise detectable homologs in current reference databases is a putative conotoxin. By applying ConusPipe to RNA-Seq data from 10 species, we report 5148 new putative conotoxin transcripts that have no homologs in current reference databases. Of these, 896 were identified by at least three of the four models used. These data significantly expand current publicly available conotoxin datasets, and our approach provides a new computational avenue for the discovery of novel toxin families.
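
The "at least three out of four models" criterion in the abstract amounts to a consensus vote over per-model predictions. The sketch below illustrates only that filtering step, together with one plausible sequence feature (cysteine fraction). It is not the tool's actual code: the function names, the transcript IDs, and the vote values are all hypothetical, and the abstract does not say which features or models are used.

```python
from typing import Dict, List


def cysteine_fraction(peptide: str) -> float:
    """Fraction of residues that are cysteine (C); 0.0 for an empty sequence.

    Illustrative feature only; the abstract does not list the real features.
    """
    peptide = peptide.upper()
    return peptide.count("C") / len(peptide) if peptide else 0.0


def consensus_hits(votes: Dict[str, List[bool]], min_models: int = 3) -> List[str]:
    """Return IDs of transcripts flagged as putative toxins by >= min_models models."""
    return [tid for tid, flags in votes.items() if sum(flags) >= min_models]


# Hypothetical per-transcript votes from four models (True = predicted conotoxin).
votes = {
    "transcript_a": [True, True, True, False],
    "transcript_b": [True, False, False, False],
    "transcript_c": [True, True, True, True],
}

if __name__ == "__main__":
    print(consensus_hits(votes))
    print(cysteine_fraction("CCGC"))
```

Raising `min_models` to 4 would keep only unanimous predictions, while lowering it to 1 would return every transcript flagged by any model, which corresponds to the larger of the two counts reported in the abstract.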