Taking Heuristic Based Graph Edge Partitioning One Step Ahead via OffStream Partitioning Approach

2021 
In the modern era of big data, large-scale graph computing has become challenging because of the dramatic rise in graph data size. Graph edge partitioning (GEP) is a crucial preprocessing step to distributed graph platforms, yet it is challenging to partition the large-scale graphs. GEP has shown better partition quality than the graph vertex partitioning for the graph’s skewed degree distribution. Existing GEP approaches are classified into two as stream and offline. The former category assigns edges to the partitions based on the previously received edge information. It has less partitioning quality and is affected by stream order compared to the latter while supporting big graph partitioning. The latter uses complete knowledge of a graph during partitioning and hence has a better partitioning quality than the former; however, it does not support large-scale graphs. In this study, we propose a novel OffStream partitioning approach (OSPA) and hybrid graph edge partitioner OffStreamNH. OSPA leverages both the offline and stream graph partitioning approaches through stateful partitioning by introducing a state layer. This stateful partition state is recorded while offline is partitioning its input graph. It contains partial knowledge of previously partitioned data and is used by the stream partitioner. The OffStreamNH uses Neighborhood Expansion (NE) and Higher Degree Replicated First (HDRF) algorithms for the offline and online; respectively, with minor modifications of both algorithms. Experimental results show that OffStreamNH outperforms the state of the art stream partitioners in terms of replication factor, load balance and tolerates the effect of stream orders.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    18
    References
    0
    Citations
    NaN
    KQI
    []