High-Performance Computing Optimization for Aladyn – Adaptive Neural Network Molecular Dynamics Mini-Application

2019 
This report describes and evaluates the optimization techniques applied to the high-performance computing (HPC) implementation of the open source Computational Materials mini-application Aladyn (https://github.com/nasa/aladyn). Aladyn is a basic molecular dynamics code, written in FORTRAN 2003, that is designed to demonstrate the use of adaptive neural networks (ANNs) in atomistic simulations. The role of ANNs is to efficiently reproduce the very complex energy landscape resulting from the atomic interactions in materials with the accuracy of the more expensive quantum mechanics-based calculations. The ANN is trained on a large set of atomic structures calculated using the density functional theory (DFT) method. While orders of magnitude faster than DFT, the ANN-based approach is still very computationally demanding compared to the conventional approach of using empirically fitted energy functions. After its initial development, Aladyn was evaluated and optimized by experts at the NASA Advanced Supercomputing (NAS) division to exploit modern supercomputer architectures. The code has been optimized for execution on multicore central processing units (CPUs), including the Intel® Skylake microarchitecture, and on graphics accelerators, such as Nvidia® V100 graphics processing units (GPUs), using the Open Multi-Processing (OpenMP) and Open Accelerators (OpenACC) programming interfaces. The optimization achieved a speedup of 4.7 times over the baseline version on CPUs and an additional 2.4 times on CPU+GPU.
Atomistic computer simulations are a fundamental tool in materials research for modeling material properties from physics-based first principles. Atomic interactions, governed by quantum mechanics (QM), require sophisticated and highly computationally demanding mathematical models to calculate [1].
Classical methods use approximate functional forms, empirically fitted through a set of variable parameters, to emulate atomic energies as direct functions of atomic coordinates [2]. While empirical potentials are computationally much simpler, allowing simulations of large-scale systems of up to a trillion (10^12) atoms [3], they are substantially less accurate than quantum calculations and are applicable only to very specific atomic configurations or predefined crystallographic phases. A recently suggested approach is to use heuristic machine learning methods [4], such as those based on adaptive neural networks (ANNs), to predict atomic energies after training on a sufficiently large database of QM-calculated structures [5,6]. This approach significantly reduces the computational complexity, allowing simulations of systems orders of magnitude larger than those accessible to QM-based methods without compromising accuracy. Still, compared to classical methods using empirical energy functions, ANN methods remain two to three orders of magnitude more computationally demanding. Hence, the computational cost of simulations, together with the need for extensive training of ANNs, still makes the practical implementation of ANN-based methods quite challenging. The purpose of the Aladyn mini-application software [7], available as open source at https://github.com/nasa/aladyn, is to serve as a testbed for exploring optimization strategies for developing highly scalable parallel algorithms for ANN-based atomistic simulations. Aladyn is aimed at utilizing the architecture of modern high-end high-performance computing (HPC) hardware based on multicore central processing units (CPUs) equipped with graphics processing unit (GPU) accelerators. Specifically, the goal is to optimize the performance on a single HPC compute node before scaling to multi-node parallelization using the message passing interface (MPI).
At the same time, the open source code of Aladyn can serve as a training model for students and professors in academia.
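The contrast drawn above between empirical energy functions and ANN-predicted atomic energies can be illustrated with a minimal sketch. It is not taken from Aladyn: the Lennard-Jones form, the Gaussian radial descriptor, the single hidden layer, and all names and parameter values (`CENTERS`, `W_HIDDEN`, and so on) are assumptions chosen only to show the general shape of the two approaches, and the network weights are untrained placeholders.

```python
import math

def lennard_jones_energy(positions, epsilon=1.0, sigma=1.0):
    """Empirical pair potential: energy as a direct function of coordinates."""
    total = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r = math.dist(positions[i], positions[j])
            sr6 = (sigma / r) ** 6
            total += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return total

def radial_descriptor(positions, i, centers, width=0.5, cutoff=3.0):
    """Hypothetical local descriptor: Gaussian-smeared neighbor distances of atom i."""
    features = [0.0] * len(centers)
    for j, pos in enumerate(positions):
        if j == i:
            continue
        r = math.dist(positions[i], pos)
        if r < cutoff:
            for k, c in enumerate(centers):
                features[k] += math.exp(-((r - c) / width) ** 2)
    return features

def ann_atomic_energy(features, w_hidden, b_hidden, w_out, b_out):
    """One tanh hidden layer mapping a descriptor to a scalar atomic energy."""
    hidden = [math.tanh(sum(w * f for w, f in zip(row, features)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

def ann_total_energy(positions, centers, w_hidden, b_hidden, w_out, b_out):
    """Total energy as the sum of per-atom network outputs."""
    return sum(
        ann_atomic_energy(radial_descriptor(positions, i, centers),
                          w_hidden, b_hidden, w_out, b_out)
        for i in range(len(positions))
    )

# Placeholder (untrained) parameters, for illustration only.
CENTERS = [1.0, 2.0]
W_HIDDEN = [[0.5, -0.3], [0.2, 0.1], [-0.4, 0.6]]
B_HIDDEN = [0.0, 0.1, -0.1]
W_OUT = [0.3, -0.2, 0.5]
B_OUT = 0.0

dimer = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
e_lj = lennard_jones_energy(dimer)
e_ann = ann_total_energy(dimer, CENTERS, W_HIDDEN, B_HIDDEN, W_OUT, B_OUT)
```

In this sketch the empirical potential needs only a few arithmetic operations per atom pair, whereas the ANN path builds a descriptor and evaluates a network for every atom. Because the per-atom evaluations are independent of one another, a loop of this kind is a natural candidate for thread-level and accelerator parallelism.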