Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences.

2002 
The gene content of the mammalian genome is a topic of great interest. While draft sequences are now available for the human (1, 2), mouse (www.ensembl.org/Mus_musculus), and rat (http://hgsc.bcm.tmc.edu/projects/rat) genomes, the challenge remains to correctly identify all of the encoded genes. Difficulty in deciphering the anatomy of mammalian genes is due to several factors, including large amounts of intervening (noncoding) sequence, the imperfection of gene-prediction algorithms (3), and the incompleteness of cDNA-sequence resources, many of which consist of gene tags of variable length and quality. Full-length cDNA sequences are extremely useful for determining the genomic structure of genes, especially when analyzed within the context of genomic sequence. To facilitate gene-identification efforts and to catalyze experimental investigation, the National Institutes of Health (NIH) launched the Mammalian Gene Collection (MGC) program (4) with the aim of providing freely accessible, high-quality sequences for validated, complete ORF cDNA clones. In this article, we describe our progress toward the goal of identifying and accurately sequencing at least one full ORF-containing cDNA clone for each human and mouse gene, as well as making these fully sequenced clones available without restriction.

Materials and Methods

cDNA Library Production. MGC cDNA libraries were prepared from a diverse set of tissues and cell lines, in several different vector systems, by using a variety of methods. Vector maps and details of library construction are available at http://mgc.nci.nih.gov/Info/VectorMaps. The complete sequences for each of the MGC vectors can be found at http://image.llnl.gov/image/html/vectors.shtml. The catalog of MGC cDNA libraries can be accessed at http://mgc.nci.nih.gov.