EEC-Dedup: Efficient Erasure-Coded Deduplicated Backup Storage Systems

2017 
Modern backup storage systems adopt deduplication to save space by eliminating duplicate data, but this impairs storage reliability. Many deduplicated backup storage systems therefore apply erasure coding to post-deduplication data for fault tolerance, with the main goal of improving node repair after data loss. However, node repair may waste network bandwidth when temporarily unavailable nodes (e.g., after a power-down or network outage) rejoin the system. This paper targets degraded read performance instead of node repair performance in erasure-coded deduplicated backup storage systems. We propose a coding scheme that operates on each object, which is formed by packing deduplicated chunks (inner-object coding), rather than across multiple objects (inter-object coding), so that storage overhead is reduced and degraded read performance is improved. We also leverage a rewriting algorithm to accelerate the read throughput of recent backups. We build an erasure-coded deduplicated backup storage system prototype, EEC-Dedup, which realizes the inner-object coding scheme and the rewriting algorithm. Our experimental results based on real-world datasets show that, compared with the state of the art, EEC-Dedup improves backup degraded read throughput by up to 430% under a single node failure and reduces storage overhead by up to 28%.
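The idea of inner-object coding can be illustrated with a minimal sketch. It is not the EEC-Dedup implementation: the abstract does not name the erasure code, so a single XOR parity fragment stands in for it here, and the function names (encode_object, degraded_read) and the parameter k are hypothetical. The sketch only shows that when one object is striped on its own, a degraded read needs the surviving fragments of that object alone.

```python
from __future__ import annotations
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_object(obj: bytes, k: int) -> list[bytes]:
    """Split ONE object into k data fragments plus one XOR parity fragment.

    Assumption: XOR parity is a stand-in for the real erasure code and
    tolerates only a single missing fragment.
    """
    frag_len = -(-len(obj) // k)  # ceiling division
    padded = obj.ljust(frag_len * k, b"\0")  # zero-pad to a multiple of k
    frags = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, frags)
    return frags + [parity]


def degraded_read(frags: list[bytes | None], k: int, obj_len: int) -> bytes:
    """Read the object back when at most one fragment (node) is unavailable."""
    missing = [i for i, f in enumerate(frags) if f is None]
    if len(missing) > 1:
        raise ValueError("XOR parity tolerates only one missing fragment")
    frags = list(frags)
    if missing and missing[0] < k:
        # Rebuild the lost data fragment from the survivors of this object only.
        survivors = [f for f in frags if f is not None]
        frags[missing[0]] = reduce(xor_bytes, survivors)
    return b"".join(frags[:k])[:obj_len]  # drop the zero padding


# Usage: an object made of packed (already deduplicated) chunks.
obj = b"chunkA" + b"chunkB" + b"chunkC-longer"
stripe = encode_object(obj, k=4)
stripe[2] = None  # simulate one unavailable node
restored = degraded_read(stripe, k=4, obj_len=len(obj))
```

In an inter-object layout, by contrast, rebuilding a lost fragment would involve fragments belonging to other objects, which is the extra traffic the abstract argues against.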