Accelerating the Parallelization of Lattice Boltzmann Method by Exploiting the Temporal Locality

2017 
The lattice Boltzmann method (LBM) is widely used in computational fluid dynamics. It is naturally suited to parallelization because its time-step loop encloses several parallelizable nested loops over the space dimensions. Most previous work focuses on DOALL parallelism and on data reuse for cache optimization. This paper proposes an approach that accelerates LBM computation by fully exploiting temporal locality on shared-memory multicore platforms. The approach uses loop transformations to obtain an implementation of the D2Q9 model of LBM that achieves cache data reuse across successive time steps together with wavefront parallelism. In addition, a synchronization strategy based on POST/WAIT operations is presented to optimize communication among parallel threads. On an Intel multicore platform, the implementation of the proposed approach outperformed the original parallel code and a cache-based approach across different grid sizes, improving performance by 11% and 9% on average, respectively.
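The abstract does not spell out the POST/WAIT scheme, so the following is only a minimal sketch of the general idea of point-to-point synchronization in a wavefront (pipelined) loop nest, not the paper's implementation. It assumes a stand-in recurrence `a[i][j] = a[i-1][j] + a[i][j-1]` instead of the D2Q9 update, uses `threading.Event` objects as the POST/WAIT flags, and the names `wavefront_post_wait`, `block`, and `done` are hypothetical. Each thread owns a band of rows and waits only on the thread above it, one column block at a time, rather than on a global barrier.

```python
import threading


def wavefront_post_wait(n_rows, n_cols, n_threads=4, block=8):
    """Fill a[i][j] = a[i-1][j] + a[i][j-1] using a pipelined wavefront.

    Stand-in recurrence (assumption): it is not the LBM D2Q9 update, but it
    has the same kind of cross-thread dependence that forces a wavefront.
    """
    # Row 0 and column 0 are boundary values fixed at 1.
    a = [[1] * n_cols for _ in range(n_rows)]
    n_blocks = (n_cols + block - 1) // block

    # done[k][b] is set (POST) once thread k has finished column block b.
    done = [[threading.Event() for _ in range(n_blocks)]
            for _ in range(n_threads)]

    # Contiguous bands of interior rows (1 .. n_rows-1), one band per thread.
    bounds = [1 + (n_rows - 1) * k // n_threads for k in range(n_threads + 1)]

    def worker(k):
        for b in range(n_blocks):
            if k > 0:
                # WAIT: the band above must have finished this column block,
                # because a[i-1][j] on our first row belongs to that band.
                done[k - 1][b].wait()
            j0 = max(1, b * block)
            j1 = min(n_cols, (b + 1) * block)
            for i in range(bounds[k], bounds[k + 1]):
                for j in range(j0, j1):
                    a[i][j] = a[i - 1][j] + a[i][j - 1]
            # POST: release the band below for this column block.
            done[k][b].set()

    threads = [threading.Thread(target=worker, args=(k,))
               for k in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a
```

With this structure, thread k can work on column block b while thread k+1 works on block b-1, which is the diagonal (wavefront) schedule. In CPython the global interpreter lock prevents a real speedup, so the sketch illustrates only the ordering of the synchronization, not its performance.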