ISA mapper: a compute and hardware agnostic deep learning compiler

2019 
Domain-specific accelerators present new challenges for code generation onto novel instruction sets, communication fabrics, and memory architectures. We introduce a shared intermediate representation to describe both deep learning programs and hardware capabilities, then formulate and apply instruction mapping to determine how a computation can be performed on a hardware system. Our scheduler chooses a specific mapping and determines data movement and computation order. With this system, we demonstrate automated extraction of matrix multiplication kernels from recent deep learning operations. We demonstrate 2-5x better performance on GEMM and GRU execution versus the state of the art on new hardware, and up to 85% of state-of-the-art performance on existing hardware.
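
The abstract does not spell out the intermediate representation or the mapping formulation, so the following is only a hypothetical illustration of the general idea of mapping a computation onto a fixed hardware instruction, not the authors' method. It assumes a made-up hardware primitive (`hw_tile_mac`) that multiplies-and-accumulates one fixed-size tile, and shows a larger matrix multiplication expressed purely as a sequence of such tile instructions. The function names, the tile shape, and the issue order are all assumptions chosen for the sketch.

```python
from itertools import product


def tiled_matmul(a, b, tile):
    """Compute a @ b using only a fixed-size tile multiply-accumulate.

    `tile` = (tm, tn, tk) is the shape of the hypothetical hardware
    instruction: C[tm x tn] += A[tm x tk] @ B[tk x tn].
    """
    tm, tn, tk = tile
    m, k = len(a), len(a[0])
    n = len(b[0])
    # In this sketch a mapping is valid only if the tile divides every dimension.
    if m % tm or n % tn or k % tk:
        raise ValueError("tile does not evenly divide the problem")
    c = [[0] * n for _ in range(m)]

    def hw_tile_mac(i0, j0, k0):
        # Stand-in for one hardware instruction (hypothetical).
        for i in range(i0, i0 + tm):
            for j in range(j0, j0 + tn):
                c[i][j] += sum(a[i][p] * b[p][j] for p in range(k0, k0 + tk))

    # The "schedule": the order in which tile instructions are issued.
    for i0, j0, k0 in product(range(0, m, tm), range(0, n, tn), range(0, k, tk)):
        hw_tile_mac(i0, j0, k0)
    return c


def reference_matmul(a, b):
    """Plain matrix multiplication, used to check the tiled version."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]
```

In this toy setting, choosing a different `tile` or reordering the loop over tile origins changes the data movement and computation order without changing the result, which is the kind of choice the abstract attributes to the scheduler.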