Chromosome-level genome assembly of the East Asian common octopus (Octopus sinensis) using PacBio sequencing and Hi-C technology.

2020 
Cephalopods are a highly diverse group of marine species in the phylum Mollusca and are distributed worldwide. They have evolved some vertebrate-like biological traits and exhibit complex behavioral repertoires. They are therefore valuable for studying the mechanisms of evolutionary convergence, novel functional structures, and evolutionary adaptation to a highly active, predatory lifestyle in diverse marine environments. Despite the evolutionary placement and biological significance of cephalopods, genomic data on these organisms remain limited. Here, we assembled a chromosome-level genome of a female East Asian common octopus (Octopus sinensis) by combining Pacific Biosciences (PacBio) single-molecule real-time sequencing, Illumina paired-end sequencing, and Hi-C technology. An O. sinensis genome of 2.72 Gb was assembled from a total of 245.01 Gb of high-quality PacBio sequences. The assembled genome has a BUSCO completeness of 80.2%, a contig N50 of 490.36 Kb, and a scaffold N50 of 105.89 Mb, a considerable improvement over other sequenced cephalopod genomes. Hi-C scaffolding anchored the genome onto 30 pseudo-chromosomes, representing 96.41% of the assembled sequences. The genome contained 42.26% repeat sequences and 5,245 noncoding RNAs. A total of 31,676 protein-coding genes were predicted, of which 82.73% were functionally annotated. Comparative genomic analysis identified 17,020 orthologous gene families, including 819 unique gene families and 629 expanded gene families. This genomic information will be an important molecular resource for further investigation of biological function and evolutionary adaptation in octopuses, and will facilitate research into their population genetics and comparative evolution.
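The contig and scaffold N50 values quoted above follow the standard definition: the length of the sequence at which the cumulative length, taken from longest to shortest, first reaches half of the total assembly length. The Python sketch below illustrates that definition on toy lengths and also works out the nominal PacBio read depth implied by the reported totals. The function names `n50` and `mean_coverage` and the toy lengths are illustrative assumptions, not taken from the study, and the depth figure assumes that all 245.01 Gb of reads are counted against the 2.72 Gb assembly size.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (standard definition)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest sequence first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running sum to half the total
        # assembly length defines the N50.
        if running >= half_total:
            return length


def mean_coverage(total_bases, assembly_size):
    """Nominal sequencing depth: sequenced bases divided by assembly size."""
    return total_bases / assembly_size


# Toy contig lengths (hypothetical, not data from the paper).
toy_contigs = [100, 80, 50, 40, 30]
print("toy N50:", n50(toy_contigs))

# Totals reported in the abstract, both in Gb.
print("nominal PacBio depth:", round(mean_coverage(245.01, 2.72), 1))
```

On the reported totals this gives a nominal depth of roughly 90-fold; the depth actually used for assembly may differ depending on read filtering.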