Draft: sOMP: NUMA and cache-aware simulations for task-based applications

2021 
Anticipating application behavior and studying and designing algorithms are among the main purposes of performance and correctness studies of simulations and applications in intensive computing. Many frameworks have been designed to simulate large distributed computing infrastructures and the applications running on them. At the node level, some frameworks have also been proposed to simulate task-based parallel applications. However, one critical capability missing from these works is the ability to take Non-Uniform Memory Access (NUMA) effects into account, even though virtually every HPC platform nowadays exhibits such effects. We thus enhance an existing simulator for dependency-based task-parallel applications, which enables experimenting with multiple data locality models. We also introduce two locality-aware performance models: we update a lightweight communication-oriented model that uses topology information to weight data transfers, and we introduce a more complex communication and cache model that takes into account data storage in the last-level cache (LLC). We validate both models on dense linear algebra test cases and show that, on average, our simulator reproducibly predicts execution time with a small relative error.