Abstract 3378: SICILIAN: Precise and unbiased detection of gene fusions at the resolution of single cells using improved statistical modeling

2020 
Gene fusions are drivers in a multitude of hematological and solid tumors and hold great promise for developing therapeutic and diagnostic procedures in the clinic, e.g., BCR-ABL1 and TMPRSS2-ERG fusions in chronic myeloid leukemia and prostate cancer, respectively. Recently, our group has established computational evidence that rare and private gene fusions are underappreciated drivers of 30% of tumors (Dehghannasiri et al., 2019). However, the function of the vast majority of gene fusions remains unknown. In principle, single-cell RNA-Seq (scRNA-Seq) provides a method to determine the gene expression perturbations resulting from fusion expression. However, current computational methodology cannot precisely call gene fusions at the single-cell level, mainly due to the small amount of transcriptomic information in each cell and substantial sequencing noise. To address these challenges, we introduce SIngle Cell precIse spLice estImAtioN (SICILIAN), a highly specific, statistically driven fusion detection algorithm that is implemented on top of traditional splice aligners. SICILIAN detects a diverse set of RNA splicing events, including linear and circular RNAs and, specifically, gene fusions at annotated or un-annotated exonic boundaries. For detecting fusions, SICILIAN takes the spliced alignment information and employs a generalized linear model (GLM) based on alignment features from the alignment file. We use junctions categorized as likely true positives (TPs) or likely false positives (FPs) via orthogonal measures as training data. After training the model, SICILIAN assigns a statistical score to each fusion junction. Considering only fusions with sufficiently high statistical scores can dramatically increase the precision of detection over typical detection strategies, such as filtering on the number of aligned reads or using ontology-level heuristic filters.
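The train-then-score workflow described above can be illustrated with a minimal sketch. It is not SICILIAN's implementation: the abstract does not name the alignment features or the GLM family, so the sketch assumes a binomial GLM with a logit link (logistic regression) and two made-up features per junction, a scaled mapping quality and a scaled junction overlap. All function names, feature values, and the threshold are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function (inverse of the logit link).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_glm(rows, labels, lr=0.5, epochs=2000):
    """Fit a binomial GLM with logit link by batch gradient descent.

    rows   : list of feature vectors, one per training junction
    labels : 1 for a likely true positive, 0 for a likely false positive
    """
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Gradient of the negative log-likelihood for one junction.
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += err
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def score_junction(features, weights, bias):
    # Statistical score in (0, 1): modeled probability that the junction is real.
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, features)))


def call_fusions(candidates, weights, bias, threshold=0.9):
    # Keep only junctions whose score reaches the (tunable, arbitrary) threshold.
    return [name for name, feats in candidates.items()
            if score_junction(feats, weights, bias) >= threshold]


# Made-up training junctions: [scaled mapping quality, scaled junction overlap].
TRAIN_X = [
    [0.90, 0.80], [0.80, 0.90], [0.95, 0.70], [0.85, 0.85],  # likely TPs
    [0.20, 0.30], [0.30, 0.10], [0.10, 0.20], [0.25, 0.35],  # likely FPs
]
TRAIN_Y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_logistic_glm(TRAIN_X, TRAIN_Y)

# Hypothetical candidate junctions to score after training.
candidates = {"GENE_A--GENE_B": [0.90, 0.90], "GENE_C--GENE_D": [0.15, 0.20]}
for name, feats in candidates.items():
    print(name, round(score_junction(feats, weights, bias), 3))
```

Filtering on the score rather than on a raw read count lets several alignment features contribute jointly to each call, which is the property the abstract credits for the gain in precision; the threshold plays the role of the tunable statistical score.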
Moreover, the assignment of statistical scores facilitates the application of false discovery rate control techniques, which use the statistical strength across thousands of single-cell samples to identify false positives due to multiple hypothesis testing. SICILIAN has a tunable statistical score for fusion calls and expands the scope of fusion detection to un-annotated exons and sequences while achieving high AUC performance on third-party simulated data. SICILIAN is currently being used to analyze massive single-cell datasets from diverse tumor types for systematic profiling of fusions at the resolution of single cells, an analysis made possible only by the unprecedented precision and scale achieved by SICILIAN. In addition to its potential to reveal heterogeneity in tumor fusion expression, SICILIAN promises to enable new discoveries about the function of fusion expression.

Citation Format: Roozbeh Dehghannasiri, Julia Eve Olivieri, Julia Salzman. SICILIAN: Precise and unbiased detection of gene fusions at the resolution of single cells using improved statistical modeling [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 3378.