Mr. Wolf: A 1 GFLOP/s Energy-Proportional Parallel Ultra Low Power SoC for IOT Edge Processing

2018 
We present Mr. Wolf, a Parallel Ultra Low Power (PULP) SoC featuring a hierarchical architecture with a small (12 kgates) microcontroller-class RISC-V core augmented with an autonomous IO subsystem for efficient data transfer from a wide set of peripherals. The small core can offload compute-intensive kernels to an 8-core, floating-point-capable processing engine that is available on demand. The proposed SoC, implemented in a 40 nm LP CMOS technology, features a 108 µW fully retentive memory (512 kB). The IO subsystem can transfer up to 1.6 Gbit/s while consuming less than 2.5 mW. The 8-core compute cluster achieves a peak performance of 850 million 32-bit integer multiply-accumulate operations per second (MMAC/s) and 500 million 32-bit floating-point multiply-accumulate operations per second (MFMAC/s), that is, 1 GFLOP/s, with an energy efficiency of up to 15 MMAC/s/mW and 9 MFMAC/s/mW. These building blocks are supported by aggressive on-chip power conversion and management, enabling energy-proportional heterogeneous computing for always-on IoT end-nodes and improving performance by several orders of magnitude with respect to traditional single-core MCUs, within a power envelope of 153 mW.
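The headline figures above are related by simple unit conversions, which the following minimal Python sketch makes explicit. It is not taken from the paper: the function names are hypothetical, and it assumes the usual convention that one multiply-accumulate counts as two floating-point operations, which is consistent with the abstract equating 500 MFMAC/s with 1 GFLOP/s.

```python
# Figures quoted in the abstract.
PEAK_MFMAC_PER_S = 500      # peak floating-point throughput, MFMAC/s
INT_EFFICIENCY = 15         # peak integer efficiency, MMAC/s/mW
FP_EFFICIENCY = 9           # peak floating-point efficiency, MFMAC/s/mW
IO_GBIT_PER_S = 1.6         # peak IO bandwidth, Gbit/s
IO_POWER_MW = 2.5           # upper bound on IO subsystem power, mW

# Assumption (not stated in the abstract): one FMAC = 1 multiply + 1 add = 2 FLOPs.
FLOPS_PER_FMAC = 2


def peak_gflops(mfmac_per_s, flops_per_fmac=FLOPS_PER_FMAC):
    """Convert MFMAC/s to GFLOP/s."""
    return mfmac_per_s * flops_per_fmac / 1000.0


def picojoules_per_op(mops_per_s_per_mw):
    """Convert an efficiency in Mop/s/mW to energy per operation in pJ.

    1 Mop/s/mW = 1e6 op/s per 1e-3 W = 1e9 op/J, so energy per op is
    1e12 pJ/J divided by (x * 1e9 op/J) = 1000 / x pJ.
    """
    return 1000.0 / mops_per_s_per_mw


def io_picojoules_per_bit(gbit_per_s, power_mw):
    """Energy per transferred bit in pJ (1 mW per Gbit/s equals 1 pJ/bit)."""
    return power_mw / gbit_per_s


if __name__ == "__main__":
    print(f"Peak FP throughput: {peak_gflops(PEAK_MFMAC_PER_S):.2f} GFLOP/s")
    print(f"Energy per integer MAC: {picojoules_per_op(INT_EFFICIENCY):.1f} pJ")
    print(f"Energy per FP MAC: {picojoules_per_op(FP_EFFICIENCY):.1f} pJ")
    print(f"IO energy per bit (upper bound): "
          f"{io_picojoules_per_bit(IO_GBIT_PER_S, IO_POWER_MW):.2f} pJ")
```

Under these assumptions, the quoted efficiencies correspond to roughly 67 pJ per integer MAC, roughly 111 pJ per floating-point MAC, and less than about 1.56 pJ per bit transferred by the IO subsystem. The abstract quotes peak performance and peak efficiency separately, and they may be reached at different operating points, so dividing one by the other does not necessarily give the cluster power.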