A Parallel And Scalable Multi-FPGA based Architecture for High Performance Applications (Abstract Only)

2015 
Several industrial applications are becoming highly sophisticated and distributed as they capture and process real-time data from several sources at the same time. Furthermore, availability of acquisition channels such as I/O interfaces per FPGA, also dictates how applications are partitioned over several devices. Thus computationally intensive, resource consuming functions are implemented on multiple hardware accelerators, making low-latency communication to be a crucial factor. In such applications, communication between multiple devices means using high-speed point-to-point protocols with little flexibility in terms of communication scalability. The problem with the current systems is that, they are usually built to meet the needs of a specific application, i.e., lacks flexibility to change the communication topology or upgrade hardware resources. This leads to obsolescence, hardware redesign cost, and also wastes computing power. Taking this into consideration, we propose a scalable, modular and customizable computing platform, with a parallel full-duplex communication network, that redefines the computation and communication paradigm in such applications. We have implemented a scalable distributed secure H.264 encoding application with 3 channels over 3 customizable FPGA modules. In a distributed architecture, the inter-FPGA communication time is almost completely overshadowed by the overall execution time for bigger data-sets, and is comparable to the overall execution time of a non-distributed architecture, for the same implementation scaled down to 1 channel for 1 FPGA. This makes our architecture highly scalable and suitable for high-performance streaming applications. With 3 detachable FPGA modules, each sending and receive data simultaneously at 3 GB/s each, we measured the total net unidirectional traffic at any given time in the system is 9 GB/s, making the total net bidirectional bandwidth for 6 modules to be 36 GB/s.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    13
    References
    2
    Citations
    NaN
    KQI
    []