Accelerating Synchronization in Graph Analytics Using Moving Compute to Data Model on Tilera TILE-Gx72

2018 
The shared memory cache coherence paradigm is prevalent in modern multicores. However, as the number of cores increases, synchronization between threads limits performance scaling. The Tilera TILE-Gx72 multicore incorporates hardware-based core-to-core explicit messaging as an auxiliary communication capability alongside shared memory cache coherence. We propose using this explicit messaging capability to build a moving computation to data model that accelerates synchronization through fine-grain serialization of critical code regions at dedicated cores. The proposed communication model exploits data locality and improves performance over both spin-lock and atomic instruction based synchronization methods for a set of parallelized graph analytic benchmarks executing on real-world graphs. Experimental results show, on average, 34% better performance than spin-locks and 15% better performance than atomic instructions in a 64-core setup on the TILE-Gx72.
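As a rough illustration of the moving computation to data idea described above, the following minimal Python sketch has worker threads send critical-section requests to one dedicated owner thread instead of acquiring a lock. It is only a software analogy under stated assumptions: `queue.Queue` stands in for the hardware core-to-core messaging, threads stand in for cores, and the names `run_moving_compute_to_data`, `owner`, and `worker` are hypothetical. It does not use any TILE-Gx72 API and is not the authors' implementation.

```python
import queue
import threading


def run_moving_compute_to_data(num_workers, updates_per_worker, num_vertices=4):
    """Workers ship critical-section requests to one owner thread instead of locking."""
    # Assumption: a FIFO queue stands in for the hardware message channel.
    requests = queue.Queue()
    # Shared data that only the owner thread ever reads or writes.
    vertex_counts = {}

    def owner():
        # Dedicated "core": executes critical sections one at a time,
        # so no lock or atomic instruction is needed on vertex_counts.
        while True:
            msg = requests.get()
            if msg is None:  # sentinel: no more requests
                break
            vertex, delta = msg
            vertex_counts[vertex] = vertex_counts.get(vertex, 0) + delta

    def worker():
        # Instead of locking the shared data, send the update to its owner.
        for i in range(updates_per_worker):
            requests.put((i % num_vertices, 1))

    owner_thread = threading.Thread(target=owner)
    owner_thread.start()

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()

    # All worker messages are already queued, so the sentinel arrives last.
    requests.put(None)
    owner_thread.join()
    return vertex_counts


result = run_moving_compute_to_data(num_workers=8, updates_per_worker=100)
```

Because every update to `vertex_counts` runs on the owner thread, the data stays local to one place and the workers never contend for a lock. The sketch says nothing about performance; the speedups reported in the abstract depend on the hardware messaging network.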