Optimizing the hybrid parallelization of BHAC

Salvatore Cielo, Oliver Porth, Luigi Iapichino, Anupam Karmakar, Héctor Olivares, Chun Xia

2021

We present our experience with the modernization of the GR-MHD code BHAC, aimed at improving its novel hybrid (MPI+OpenMP) parallelization scheme. In doing so, we showcase performance profiling tools available on x86 (Intel-based) architectures. Our performance characterization and threading analysis guided improvements to the concurrency, and thus the efficiency, of the OpenMP parallel regions. We assess scaling and communication patterns to identify and alleviate MPI bottlenecks, using both runtime switches and targeted code interventions. The performance of the optimized version of BHAC improved by $\sim28\%$, making it viable for scaling on several hundred supercomputer nodes. Finally, we test whether these optimizations remain beneficial on different hardware by running on ARM A64FX vector nodes.


