Haplotype-resolved and integrated genome analysis of the cancer cell line HepG2

2019 
SUMMARY: The HepG2 cancer cell line is one of the most widely used cell lines in biomedical research and one of the main cell lines of ENCODE. Vast numbers of functional genomics and epigenomics datasets have been produced to characterize its biology. However, the correct interpretation of such data requires an understanding of the cell line's genome sequence and genome structure. Using a variety of sequencing and analysis methods, we identified a wide spectrum of HepG2 genome characteristics: copy numbers of chromosomal segments, SNVs and indels (corrected for aneuploidy), phased haplotypes extending to entire chromosome arms, loss of heterozygosity, retrotransposon insertions, and structural variants (SVs), including complex and somatic genomic rearrangements. We also identified allele-specific expression and DNA methylation genome-wide and assembled an allele-specific CRISPR/Cas9 targeting map.

SIGNIFICANCE: This haplotype-resolved, comprehensive whole-genome analysis of HepG2, a cell line widely used in cancer research and by ENCODE, serves as an essential resource for unlocking complex cancer gene regulation using a genome-integrated framework. It also provides genomic context for the analysis of the ~1,000 functional datasets available to date on ENCODE for biological discovery. We also demonstrate how deeper insights into genomic regulatory complexity are gained by adopting a genome-integrated framework.