Sparkle: adaptive sample based scheduling for cluster computing

2015 
Traditional centralised scheduling has become unsuitable for analytics clusters with ever-growing workloads. As a promising alternative, sample based scheduling is highly scalable and, owing to its decentralised nature, does not become a system bottleneck. However, the existing design functions well only in very specific application scenarios. Specifically, we argue that the performance of the baseline sample based scheduling method is sensitive to workload heterogeneity and to the strength of the cluster's individual workers. In this work, we propose a novel method to reduce these sensitivities. We implement our method in the Sparkle scheduler and demonstrate that the scheduler adapts to a much wider range of scenarios. Rather than introducing extra system costs, Sparkle gains its improved performance by cutting unnecessary waste and reducing the number of sub-optimal scheduling decisions. Hence, it could also serve as a foundation model for further studies in decentralised scheduling.