SInC: Semantic approach and enhancement for relational data compression

2022 
Data compression has been widely adopted in industry to reduce storage or bandwidth consumption by removing redundant data or encoding information. Redundancy in semantics implies that some facts in a knowledge base can be inferred from the others. For relational databases, it is therefore possible to remove records that are semantically equivalent to what the remaining records already imply. In this paper, we present a purely semantic approach, which first losslessly compresses relational data and then enhances data file compression to further reduce storage. Our Semantic Inductive Compressor (SInC) works not only for intra-relation patterns but also for inter-relation cases. SInC achieves semantic compression ratios of around 1/3 to 2/3, and the original data can be entirely retrieved with the informative patterns induced by SInC. We apply industrial data compression tools to the semantically compressed databases, and the experimental results indicate that the compression ratio is enhanced by up to 35%. Almost all efforts in our technique turn to the enhancement.
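The idea of removing records that can be inferred from others can be illustrated with a minimal sketch. This is not SInC itself: the single hand-written rule, the relation names `works_in` and `lives_in`, the helper functions, and the use of an explicit exception set to keep the round trip lossless are all assumptions made for illustration, whereas the paper describes inducing patterns automatically.

```python
# Toy illustration only: the rule, relations, and helpers are hypothetical,
# not the algorithm or API described in the paper.

def infer(facts):
    """Apply one hand-written rule: works_in(X, Y) -> lives_in(X, Y)."""
    return {("lives_in", x, y) for (rel, x, y) in facts if rel == "works_in"}

def compress(facts):
    """Drop facts the rule can re-derive; record wrong derivations as exceptions."""
    derived = infer(facts)
    kept = facts - derived              # only lives_in facts are ever removed
    counterexamples = derived - facts   # derived by the rule but not in the data
    return kept, counterexamples

def decompress(kept, counterexamples):
    """Re-derive the removed facts, then discard the recorded exceptions."""
    return (kept | infer(kept)) - counterexamples

facts = {
    ("works_in", "ann", "paris"),
    ("works_in", "bob", "paris"),
    ("works_in", "cy", "rome"),
    ("lives_in", "ann", "paris"),   # inferable, so removable
    ("lives_in", "bob", "paris"),   # inferable, so removable
    ("lives_in", "dee", "oslo"),    # not inferable, must be kept
}

kept, counterexamples = compress(facts)
restored = decompress(kept, counterexamples)
```

In this sketch `restored` equals `facts`, so nothing is lost: two `lives_in` records are dropped, and one exception is stored because the rule would otherwise wrongly derive a `lives_in` record for `cy`. A general-purpose file compressor could then be run on the smaller set of kept records plus the exceptions.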