A New Combinatorial Coded Design for Heterogeneous Distributed Computing

2021 
Coded Distributed Computing (CDC) introduced by Li et al. in 2015 offers an efficient approach to trade computing power to reduce the communication load in general distributed computing frameworks such as MapReduce and Spark. In particular, increasing the computation load in the Map phase by a factor of $r$ can create coded multicasting opportunities to reduce the communication load in the Shuffle phase by the same factor. However, the CDC scheme is designed for the homogeneous settings, where each node maps the same number of files and is assigned the same number of reduce functions. It requires an exponentially large number of input files (data batches), reduce functions and multicasting groups relative to the number of nodes to achieve the promised gain. We address the CDC limitations by proposing a novel CDC approach based on a combinatorial design, which accommodates heterogeneous networks and maintains a multiplicative computation-communication trade-off. In addition, the proposed approach requires an exponentially less number of input files compared to the original CDC scheme proposed by Li et al. Finally, we derive a new information theoretic converse for general heterogeneous CDC and show that the communication load of the proposed design is optimal within a constant factor.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    39
    References
    1
    Citations
    NaN
    KQI
    []