An 176.3 GOPs Object Detection CNN Accelerator Emulated in a 28nm CMOS Technology

2021 
Object detection is an important topic in the implementation of artificial intelligence systems. Recent research presents many attempts to build real-time object detection hardware/software in an SoC. However, because of the heavy demands on both memory bandwidth and parallel computing resources, only a few designs can meet the real-time requirement, which is important for applications or payloads such as drones, UAVs, and autonomous vehicles. This paper discusses issues in the design of SoC-based object detection and presents an ongoing hardware design based on the YOLO algorithm and the ARC platform. By optimizing both the number of processing elements and the memory bandwidth, a 176.3 GOPs CNN accelerator that achieves 30 fps at 400 MHz is presented. In addition to the object detection engine, a ZCA image preprocessor and an NMS postprocessor are proposed to simplify the corresponding CNN model and enhance the real-time performance of object detection. Emulation results and demonstration videos on the ARC platform will be presented. Post-layout simulation of the current design verified the targeted real-time performance in a 28nm CMOS technology.
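The abstract does not describe how its NMS postprocessing stage works. As a rough illustration only, the following is a minimal software sketch of greedy non-maximum suppression, which is what NMS commonly denotes in object detection. The function names `iou` and `nms`, the corner-coordinate box format, and the 0.5 overlap threshold are assumptions made for this example and are not taken from the paper's hardware design.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    # Visit candidates in descending score order.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box
        # by more than the threshold.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

In this usage example the second box overlaps the first with an IoU of 81/119 (about 0.68), so it is suppressed and only the first and third boxes are kept. A hardware postprocessor would likely use fixed-point arithmetic and a bounded candidate count, details the abstract does not give.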