Improving Coding Efficiency of MPEG-G Standard Using Context-Based Arithmetic Coding

2018 
The rapid decrease in DNA sequencing costs, together with the need for exponentially more DNA sequencing in biomedical research, drug discovery, and clinical genomics, has led to a burst in the volume of genomic data. The amount of genomic data is now growing at a rate that has already far outpaced Moore’s Law. Storage and transmission of genomic data have become a new bottleneck for sequencing applications. In response to this need, the ISO/IEC JTC 1/SC 29/WG 11 (MPEG) standards organization started a new ISO standard project, the MPEG-G standard, for efficient genomic information representation and compression. In this paper, we present an improved method for lossless compression of nucleobase quality values, one of the most challenging parts of genomic data to compress because of their high entropy. The proposed method has been adopted by MPEG-G as the normative lossless coding method for quality values because of its good performance and its compatibility with the context-adaptive binary arithmetic coding (CABAC) entropy coding engine currently used in the MPEG-G framework.