One size does not fit all: Implementation trade-offs for iterative stencil computations on FPGAs

Gaël Deest, Tomofumi Yuki, Sanjay V. Rajopadhye, Steven Derrien

2017

Iterative stencils are kernels found in application domains such as numerical simulation and medical imaging, and they merit FPGA acceleration. The best architecture depends on many factors, such as the target platform, off-chip memory bandwidth, problem size, and performance requirements. We generate a family of FPGA stencil accelerators targeting emerging System on Chip platforms (e.g., Xilinx Zynq or Intel SoC). Our designs expose design knobs for exploring trade-offs. We also propose performance models to home in on the most interesting design points, and show how they accurately lead to optimal designs. The optimal choice depends on problem sizes and performance goals.
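For readers unfamiliar with the term, the following is a minimal software sketch of an iterative stencil, assuming a 1D three-point averaging (Jacobi-style) update with fixed boundary values. The function name `jacobi_1d` and the choice of stencil are illustrative assumptions and are not taken from the paper; the sketch shows only the computation pattern, not the FPGA architectures the abstract describes.

```python
def jacobi_1d(grid, steps):
    """Apply `steps` sweeps of a 3-point averaging stencil.

    Hypothetical example kernel: each interior point is replaced by the
    mean of itself and its two neighbours; the two end points stay fixed.
    """
    cur = list(grid)
    for _ in range(steps):
        nxt = cur[:]  # copy so that boundary values carry over unchanged
        for i in range(1, len(cur) - 1):
            # Every update reads only the previous time step (cur),
            # which is what makes this a Jacobi-style iteration.
            nxt[i] = (cur[i - 1] + cur[i] + cur[i + 1]) / 3.0
        cur = nxt
    return cur


if __name__ == "__main__":
    print(jacobi_1d([0.0, 3.0, 0.0], 1))
```

Each sweep reads every grid point a fixed number of times and depends on the full result of the previous sweep, which is why off-chip memory bandwidth and problem size weigh heavily on the choice of architecture.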

Keywords:

Computer science
System on a chip
Parallel computing
Optimal design
Computation
Memory management
Real-time computing
Field-programmable gate array
Memory bandwidth
Stencil
Architecture
Bandwidth (signal processing)
