Abstract 4859: Tumor and normal classification of formalin-fixed, paraffin-embedded (FFPE) specimens by transcriptome RNA-seq

2011 
We have used RNA-seq to profile and compare normal and cancerous human breast tissue. FFPE breast specimens from a total of 24 patients, 12 normal (N) and 12 tumor (T) specimens from surgical resections, were analyzed on an Illumina GA IIx sequencer. Whole transcriptome RNA-Seq libraries were prepared after depletion of ribosomal RNA by a protocol developed at Genomic Health Inc. (GHI). The analysis was multiplexed across two flow cells using barcoding, with two specimens per sequencing lane (1 T and 1 closely age-matched N library from a different patient). FFPE tissue archive times ranged from 10 to 13 years and were also closely matched within each lane. To evaluate reproducibility, triplicate libraries were created from 4 of the specimens and analyzed within and across flow cells. Libraries yielded, on average, 19 million 51 bp sequences. R² values obtained from replicate libraries prepared from the same patient RNA were > 0.9 within and between flow cells. More than 80% of known genes in the human genome were detected in all patients. Several thousand intergenic transcripts were identified by an algorithm developed at GHI. A negative binomial model with tag-wise estimates of dispersion was applied to the known genes and intergenic regions. Inter-patient count variance was generally higher in the set of intergenic sequences than in the set of gene (RefSeq) sequences. Thousands of gene (RefSeq) and intergenic sequences were found to be differentially expressed between T and N tissues. We sought to build classifiers based on flow cell #1 data that could stratify T and N tissues when applied to flow cell #2 data. Sets of genes and intergenic regions were selected for analysis based on high inter-patient count variance. Support vector machine classifiers were trained and then applied to the data from flow cell #2, and also to another GHI tumor/normal RNA-Seq study.
Either a set of 100 genes (RefSeq) or a set of 70 intergenic sequences accurately distinguished tumor and normal tissues. Our results offer further evidence of the potential of RNA-Seq for discovery of biomarkers.

Citation Format: {Authors}. {Abstract title} [abstract]. In: Proceedings of the 102nd Annual Meeting of the American Association for Cancer Research; 2011 Apr 2-6; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2011;71(8 Suppl):Abstract nr 4859. doi:10.1158/1538-7445.AM2011-4859
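
The classification workflow described in the abstract (select features by high inter-patient variance on one flow cell, train a support vector machine, apply it to the other flow cell) can be illustrated with a minimal sketch. The abstract does not state the kernel, normalization, or software used, so everything here is an assumption: a linear SVM fitted by subgradient descent on the hinge loss, z-scoring with training-set statistics only, and hypothetical function names. The toy expression values are invented for illustration and are not data from the study.

```python
import random

def top_variance_features(rows, k):
    """Return sorted indices of the k features with the highest sample variance."""
    n = len(rows)
    scored = []
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        scored.append((sum((v - mean) ** 2 for v in col) / (n - 1), j))
    return sorted(j for _, j in sorted(scored, reverse=True)[:k])

def fit_scaler(rows):
    """Per-feature mean and standard deviation, estimated on training rows only."""
    n = len(rows)
    width = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(width)]
    sds = [(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) ** 0.5 or 1.0
           for j in range(width)]
    return means, sds

def scale(row, means, sds):
    """Z-score one sample with previously fitted means and standard deviations."""
    return [(v - m) / s for v, m, s in zip(row, means, sds)]

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200, seed=0):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss.

    Labels in y must be +1 (tumor) or -1 (normal). Assumed settings, not the
    authors' configuration.
    """
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]   # shrink step (regularizer)
            if margin < 1.0:                          # hinge loss is active
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def predict(w, b, row):
    """Return +1 (tumor) or -1 (normal) for one scaled sample."""
    return 1 if sum(wj * xj for wj, xj in zip(w, row)) + b > 0 else -1

# Toy log-expression matrices (invented): rows are samples, columns are features.
train = [[10.0, 2.0, 5.0, 5.0, 5.0],    # "flow cell 1", tumor
         [11.0, 1.0, 5.0, 5.1, 5.0],    # tumor
         [2.0, 10.0, 5.0, 5.0, 5.1],    # normal
         [1.0, 11.0, 5.1, 5.0, 5.0]]    # normal
train_labels = [1, 1, -1, -1]
test = [[10.5, 1.5, 5.0, 5.0, 5.0],     # "flow cell 2", tumor-like
        [1.5, 10.5, 5.0, 5.0, 5.0]]     # normal-like

selected = top_variance_features(train, k=2)

def subset(row):
    """Keep only the selected high-variance features."""
    return [row[j] for j in selected]

means, sds = fit_scaler([subset(r) for r in train])
w, b = train_linear_svm([scale(subset(r), means, sds) for r in train], train_labels)
predictions = [predict(w, b, scale(subset(r), means, sds)) for r in test]
```

Feature selection and scaling use only the training flow cell, so the held-out flow cell plays no part in fitting, which mirrors the train-on-one, test-on-the-other design described above.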