Memory system characterization of deep learning workloads

2019 
Deep neural networks (DNNs) have emerged as the prevalent approach for implementing learning tasks in many application domains. As DNN models become increasingly complex, the large amount of data generated during network computations exerts substantial pressure on the capacity and bandwidth of the memory subsystem. Consequently, the memory hierarchy is quickly becoming a major bottleneck for DNN performance scaling. Optimizing the memory subsystem for DNN workloads requires a detailed understanding of memory access behavior and of the complex interactions between the processor and the memory hierarchy. Unfortunately, most existing memory profiling studies for DNNs rely either on a high-level software understanding of individual DNN layers or on analytical models. Neither approach captures the complex feedback loop between the processor and the memory subsystem. To address this challenge, we have developed a simulation infrastructure, DLsim, which carries out detailed memory profiling of DNN workloads in a full-system simulation environment. In this paper, we report the key findings from our memory characterization analysis of five popular DNNs implemented in the widely used TensorFlow framework. Based on these findings, we identify multiple memory system optimization opportunities tailored for DNNs.