Improving Utilization and Parallelism of Hadoop Cluster by Elastic Containers

2018 
Modern datacenter schedulers apply a static policy to partition resources among tasks: the amount of resource allocated to a task does not change during its lifetime. However, we found that resource usage during a task's runtime is highly dynamic and reaches the full allocation only at a few moments. A static allocation policy therefore fails to exploit the dynamic nature of resource usage, leading to low system resource utilization. To address this hard problem, a recently proposed task-consolidation approach packs as many tasks as possible onto the same node based on real-time resource demands. However, this approach may cause resource over-allocation and harm application performance. In this paper, we propose and develop ECS, an elastic-container-based scheduler that leverages resource usage variation within a task's lifetime to exploit the potential utilization and parallelism. The key idea is to proactively select tasks and shift them backward so that inherently parallel tasks can be identified without over-allocation. We formulate the scheduling scheme as an online optimization problem and solve it using a resource leveling algorithm. We have implemented ECS in Apache YARN and evaluated it with various MapReduce benchmarks in a cluster. Experimental results show that ECS utilizes resources efficiently, achieving up to a 29% reduction in average job completion time while increasing CPU utilization by 25%, compared to stock YARN.
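The idea of shifting tasks to avoid over-allocation can be illustrated with a minimal sketch of generic greedy resource leveling. This is not the ECS algorithm, which the abstract does not specify. The function names, the discrete time slots, the single resource dimension, and the reading of "shift backward" as "start later" are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def fits(load: List[float], profile: Sequence[float],
         start: int, capacity: float) -> bool:
    """Return True if running `profile` from slot `start` never exceeds capacity."""
    for offset, demand in enumerate(profile):
        t = start + offset
        current = load[t] if t < len(load) else 0.0
        if current + demand > capacity:
            return False
    return True


def place(load: List[float], profile: Sequence[float], start: int) -> None:
    """Add the task's per-slot demand to the node's load timeline, in place."""
    needed = start + len(profile)
    if needed > len(load):
        load.extend([0.0] * (needed - len(load)))
    for offset, demand in enumerate(profile):
        load[start + offset] += demand


def level_tasks(profiles: Sequence[Sequence[float]],
                capacity: float) -> Tuple[List[int], List[float]]:
    """Greedy leveling (hypothetical): delay each task to the earliest start
    slot at which the summed demand stays within capacity."""
    load: List[float] = []
    starts: List[int] = []
    for profile in profiles:
        if max(profile) > capacity:
            raise ValueError("task peak demand exceeds node capacity")
        start = 0
        while not fits(load, profile, start, capacity):
            start += 1  # shift the task one slot later and retry
        place(load, profile, start)
        starts.append(start)
    return starts, load


if __name__ == "__main__":
    # Two tasks with a short peak of 4 units and one steady task of 2 units,
    # on a node with 4 units of capacity.
    starts, load = level_tasks([[1, 4, 1], [1, 4, 1], [2, 2]], capacity=4)
    print(starts, load)
```

In this toy input, reserving each task's peak demand for its whole lifetime would force the three tasks to run one after another (8 slots in total), whereas the leveled schedule overlaps them at start slots 0, 2, and 4 and finishes in 6 slots without ever exceeding the capacity of 4.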