A customized GPU acceleration of the Princeton Ocean Model

2014 
Although GPUs are becoming a compelling acceleration solution for a range of scientific applications, most existing work on climate models has achieved only limited speedup. This is due to partial porting of the large code bases and the inherently memory-bound nature of these models. In this work, we design and implement a customized GPU-based acceleration of the Princeton Ocean Model (gpuPOM) based on mpiPOM, one of the parallel versions of the Princeton Ocean Model. Targeting Nvidia's state-of-the-art GPU architectures (K20X and K40m), we rewrite the full mpiPOM model from the original Fortran version into CUDA-C. We present the GPU acceleration methods used in gpuPOM, especially the techniques that ease its memory-bound problem through better use of the GPU's memory hierarchy. The experimental results indicate that gpuPOM with one K40m GPU achieves a 6.3-fold to 16.7-fold speedup over different Intel multi-core CPUs, and that one K20X GPU achieves a 5.8-fold to 15.5-fold speedup.