An Auto-tuning with Adaptation of A64 Scalable Vector Extension for SPIRAL
2021
In this paper, we propose an auto-tuning (AT) system by adapting the A64 Scalable Vector Extension for SPIRAL to generate discrete Fourier transform (DFT) implementations. The performance of our method is evaluated using the Supercomputer "Flow" at Nagoya University. The A64 scalable vector extension applied DFT codes are up to 1.98 times faster than scalar DFT codes and up to 3.63 times higher in terms of the SIMD instruction rate. In addition, we obtain a factor of maximum speedup 2.32 by adapting proposed AT system for loop unrolling.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
12
References
0
Citations
NaN
KQI