Three Steps Is All You Need: Fast, Accurate, Automatic Scaling Decisions For Distributed Streaming Dataflows

Authors:
Vasiliki Kalavri ETH Zurich
John Liagouris ETH Zurich
Moritz Hoffmann ETH Zurich
Desislava Dimitrova ETH Zurich
Matthew Forshaw Newcastle University
Timothy Roscoe ETH Zurich

Introduction:

Streaming computations are by nature long-running, and their workloads can change in unpredictable ways.the authors present DS2, an automatic scaling controller which combines a general performance model of streaming dataflows with lightweight instrumentation to estimate the true processing and output rates of individual dataflow operators

Abstract:

Streaming computations are by nature long-running, and their workloads can change in unpredictable ways. This in turn means that maintaining performance may require dynamically scaling allocated computational resources. Some modern large-scale stream processors allow dynamic scaling but typically leave the difficult task of deciding how much to scale to the user. The process is cumbersome, slow and often inefficient. Where automatic scaling is supported, policies rely on coarse-grained metrics like observed throughput, backpressure, and CPU utilization. As a result, they tend to show incorrect provisioning, oscillations, and long convergence times. We present DS2, an automatic scaling controller for such systems which combines a general performance model of streaming dataflows with lightweight instrumentation to estimate the true processing and output rates of individual dataflow operators. We apply DS2 on Apache Flink and Timely Dataflow and demonstrate its accuracy and fast convergence. When compared to Dhalion, the state-of-the-art technique in Heron, DS2 converges to the optimal, backpressure-free configuration in a single step instead of six.

You may want to know: