Performance evaluation of hybrid programming patterns for large CPU/GPU heterogeneous clusters

2012 
Abstract CPU/GPU heterogeneous clusters are important platforms for high-performance computing applications. However, running legacy scientific and engineering code efficiently on these heterogeneous systems poses many challenges. In this paper, we address the programming-model issue by combining existing models (MPI, OpenMP, and CUDA). First, two hybrid programming patterns are presented, namely MPI + CUDA and MPI + OpenMP/CUDA. Second, three kernels of the NAS Parallel Benchmarks (NPB), namely EP, CG, and MG, which are abstracted from many legacy computational fluid dynamics applications, are implemented with these two patterns. Third, the hybrid implementations are executed on the TianHe-1A supercomputer, and the experimental results show that both patterns can achieve significant performance improvement. Finally, a detailed performance analysis of the two hybrid patterns is performed, and some guidelines are given for porting legacy code onto large-scale heterogeneous CPU/GPU clusters.