RNA-seq analysis of the C. briggsae transcriptome

2012 
Curation of a high-quality gene set is the critical first step in genome research, enabling subsequent analyses such as ortholog assignment, cis-regulatory element finding, and synteny detection. In this project, we have reannotated the genome of Caenorhabditis briggsae, the best-studied sister species of the model organism Caenorhabditis elegans. First, we applied a homology-based gene predictor, genBlastG, to annotate the C. briggsae genome. We then validated and further improved the C. briggsae gene annotation through RNA-seq analysis of the C. briggsae transcriptome, which resulted in the first validated C. briggsae gene set (23,159 genes), among which 7347 genes (33.9% of all genes with introns) have all of their introns confirmed. Most genes (14,812, or 68.3%) have at least one intron validated, compared with only 3.9% in the most recent WormBase release (WS228). Of all introns in the revised gene set (103,083), 61,503 (60.1%) have been confirmed. We have also identified numerous trans-splicing leaders (SL1 and SL2 variants) in C. briggsae, leading to the first genome-wide annotation of operons in C. briggsae (1105 operons). The majority of the annotated operons (564, or 51.0%) are perfectly conserved in C. elegans, with an additional 345 operons (or 31.2%) somewhat divergent. Finally, RNA-seq analysis revealed more than 10,000 small assembly errors in the current C. briggsae reference genome that can be readily corrected. The revised C. briggsae genome annotation represents a solid platform for comparative genomics analysis and evolutionary studies of Caenorhabditis species.
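
The intron validation counts reported above (genes with all introns confirmed, genes with at least one intron confirmed, and the fraction of all introns confirmed) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `summarize_intron_support`, the intron tuple layout, and the toy input are all assumptions made for illustration, and it assumes RNA-seq splice junctions have already been extracted and are compared to annotated introns by exact coordinates.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical intron representation: (chromosome, start, end, strand).
Intron = Tuple[str, int, int, str]


def summarize_intron_support(
    gene_introns: Dict[str, List[Intron]],
    confirmed: Set[Intron],
) -> Dict[str, int]:
    """Count how many genes and introns are supported by confirmed junctions.

    gene_introns maps a gene ID to its annotated introns; confirmed is the
    set of introns with exact-coordinate RNA-seq junction support.
    """
    summary = {
        "genes_with_introns": 0,
        "all_confirmed": 0,
        "at_least_one_confirmed": 0,
    }
    unique_introns: Set[Intron] = set()

    for introns in gene_introns.values():
        if not introns:
            continue  # intronless genes are excluded from the gene-level counts
        summary["genes_with_introns"] += 1
        unique_introns.update(introns)

        hits = sum(1 for intron in introns if intron in confirmed)
        if hits > 0:
            summary["at_least_one_confirmed"] += 1
        if hits == len(introns):
            summary["all_confirmed"] += 1

    # Intron-level counts use unique introns so shared introns are counted once.
    summary["total_introns"] = len(unique_introns)
    summary["confirmed_introns"] = len(unique_introns & confirmed)
    return summary


if __name__ == "__main__":
    # Toy data only; coordinates and gene IDs are made up.
    a, b = ("I", 100, 200, "+"), ("I", 300, 400, "+")
    c = ("II", 50, 150, "-")
    d, e = ("III", 10, 90, "+"), ("III", 120, 180, "+")
    genes = {"g1": [a, b], "g2": [c], "g3": [], "g4": [d, e]}
    print(summarize_intron_support(genes, {a, b, d}))
```

Percentages such as those quoted in the abstract would then follow by dividing `all_confirmed` or `at_least_one_confirmed` by `genes_with_introns`, and `confirmed_introns` by `total_introns`.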