Accelerating Radiative Transfer Simulation with GPU-FPGA Cooperative Computation

2020 
Field-programmable gate arrays (FPGAs) have garnered significant interest in research on high-performance computing. This is ascribed to the drastic improvement in their computational and communication capabilities in recent years owing to advances in semiconductor integration technologies that rely on Moore’s Law. In addition to these performance improvements, toolchains for the development of FPGAs in OpenCL have been offered by FPGA vendors to reduce the programming effort required. These improvements suggest the possibility of implementing the concept of enabling on-the-fly offloading computation at which CPUs/GPUs perform poorly relative to FPGAs while performing low-latency data transfers. We consider this concept to be of key importance to improve the performance of heterogeneous supercomputers that employ accelerators such as a GPU. In this study, we propose GPU–FPGA-accelerated simulation based on this concept and demonstrate the implementation of the proposed method with CUDA and OpenCL mixed programming. The experimental results showed that our proposed method can increase the performance by up to $17.4 \times$ compared with GPU-based implementation. This performance is still $1.32 \times$ higher even when solving problems with the largest size, which is the fastest problem size for GPU-based implementation. We consider the realization of GPU–FPGA-accelerated simulation to be the most significant difference between our work and previous studies.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    15
    References
    1
    Citations
    NaN
    KQI
    []