BayesTuner: Leveraging Bayesian Optimization For DNN Inference Configuration Selection

2021 
Deep learning sits at the core of many applications and products deployed on large-scale infrastructures such as data centers. Since the power consumption of data centers contributes significantly to operational costs and carbon footprint, it is essential to improve their power efficiency. To this end, both the hardware platform and the application must be configured properly. However, automatically identifying the best configuration (e.g., DNN batch size, number of cores, and amount of memory allocated to the application) across a wide range of available options at affordable search cost is challenging, and exhaustively testing all possible configurations is infeasible. To tackle this challenge, we introduce BayesTuner, which employs Bayesian Optimization to estimate the performance models of deep neural network inference applications under different configurations with only a few test runs. With these models, BayesTuner can distinguish the optimal or near-optimal configurations from the rest of the options. Using a real-world setup with various DNNs, we show how efficiently BayesTuner explores the huge state space resulting from the combination of control knobs and minimizes the power consumption of the system while meeting the throughput constraints of different DNNs.
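To make the search loop concrete, below is a minimal Python sketch of the kind of Bayesian Optimization the abstract describes: a Gaussian-process surrogate fitted over a discrete grid of control knobs (batch size and core count here), an expected-improvement acquisition to pick the next configuration to test, and a penalty term for violating the throughput constraint. The knob ranges, the measure_config stub, the throughput target, and the penalty weight are all illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of a BayesTuner-style configuration search (not the authors' code):
# Bayesian Optimization over a discrete grid of control knobs, minimizing
# power while meeting a throughput constraint via a penalty term.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Hypothetical knob space: DNN batch size x number of CPU cores.
BATCH_SIZES = [1, 2, 4, 8, 16, 32, 64]
NUM_CORES = [1, 2, 4, 8, 16]
GRID = np.array([[b, c] for b in BATCH_SIZES for c in NUM_CORES], dtype=float)

THROUGHPUT_TARGET = 100.0  # images/s; illustrative service-level constraint


def measure_config(batch, cores):
    """Stand-in for a real test run; returns (power_W, throughput_img_s).

    In a real deployment this would launch the inference service with the
    given configuration and read hardware power/throughput counters.
    """
    throughput = 30.0 * cores ** 0.7 * batch ** 0.3
    power = 20.0 + 5.0 * cores + 0.4 * batch
    return power, throughput


def penalized_objective(batch, cores):
    """Power to minimize, plus a large penalty when the constraint is violated."""
    power, throughput = measure_config(batch, cores)
    penalty = max(0.0, THROUGHPUT_TARGET - throughput) * 10.0
    return power + penalty


def expected_improvement(gp, X, y_best, xi=0.01):
    """EI acquisition: expected gain of each untried config over the best so far."""
    mu, sigma = gp.predict(X, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (y_best - mu - xi) / sigma
    return (y_best - mu - xi) * norm.cdf(z) + sigma * norm.pdf(z)


rng = np.random.default_rng(0)
tried = list(rng.choice(len(GRID), size=5, replace=False))  # random warm-up runs
y = [penalized_objective(*GRID[i]) for i in tried]

for _ in range(20):  # small budget of test runs instead of exhaustive search
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(GRID[tried], np.array(y))
    ei = expected_improvement(gp, GRID, y_best=min(y))
    ei[tried] = -np.inf                     # never re-test a configuration
    nxt = int(np.argmax(ei))
    tried.append(nxt)
    y.append(penalized_objective(*GRID[nxt]))

best = tried[int(np.argmin(y))]
print("best config (batch, cores):", GRID[best], "objective:", min(y))
```

Folding the throughput constraint into the objective as a penalty is one common way to handle constrained Bayesian Optimization; constrained acquisition functions that model the constraint with a separate surrogate are an alternative design choice.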