Population-scale long-read sequencing uncovers transposable elements contributing to gene expression variation and associated with adaptive signatures in Drosophila melanogaster

2021 
ABSTRACT High quality reference genomes are crucial to understanding genome function, structure and evolution. The availability of reference genomes has allowed us to start inferring the role of genetic variation in biology, disease, and biodiversity conservation. However, analyses across organisms demonstrate that a single reference genome is not enough to capture the global genetic diversity present in populations. In this work, we generated 32 high-quality reference genomes for the well-known model species D. melanogaster and focused on the identification and analysis of transposable element variation as they are the most common type of structural variant. We showed that integrating the genetic variation across natural populations from five climatic regions increases the number of detected insertions by 58%. Moreover, 26% to 57% of the insertions identified using long-reads were missed by short-reads methods. We also identified hundreds of transposable elements associated with gene expression variation and new TE variants likely to contribute to adaptive evolution in this species. Our results highlight the importance of incorporating the genetic variation present in natural populations to genomic studies, which is essential if we are to understand how genomes function and evolve.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    108
    References
    0
    Citations
    NaN
    KQI
    []