Achieving Optimal Inter-Node Communication in Graph Partitioning Using Random Selection and Breadth-First Search

2016 
Processing large graph datasets represents an increasingly important area in computing research and applications. The size of many graph datasets has increased well beyond the processing capacity of a single computing node, thereby necessitating distributed approaches. As these datasets are processed over a distributed system of nodes, this leads to an inter-node communication cost problem that negatively affects system performance. Previously proposed algorithms implemented breadth-first search (BFS) for graph searching and focused on the execution, parallel performance and not the communication. In this paper a new methodology is proposed that combines BFS with random selection in order to partition large graph datasets and effectively minimize inter-node communication. The new method is discussed and applied to the single-source shortest path and PageRank algorithms using three graphs that are representative of real-world scenarios. Experimental results show that graph inter-node communication for canonical graphs representative of real-world data is improved up to 42 % in case of Powerlaw graph, up to 27 % in case of Random near K-regular graph (with low degree), and up to 7 % in case of Random near K-regular graph (with high degree).
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    48
    References
    1
    Citations
    NaN
    KQI
    []