Towards Comprehensive Traffic Forecasting in Cloud Computing: Design and Application

2016 
In this paper, we present our effort towards comprehensive traffic forecasting for big data applications using external, lightweight file system monitoring. Our idea is motivated by two key observations: rich traffic demand information already exists in the log and metadata files of many big data applications, and this information can be readily extracted through run-time file system monitoring. As a first step, we use Hadoop as a concrete example to explore our methodology and develop a system called HadoopWatch to predict the traffic demands of Hadoop applications. We further implement HadoopWatch on a small-scale testbed with 10 physical servers and 30 virtual machines. Our experiments over a series of MapReduce applications demonstrate that HadoopWatch can forecast traffic demand with almost 100% accuracy and in advance of the actual transfers. Furthermore, it requires no modification to the Hadoop framework and adds little overhead to application performance. Finally, to showcase the utility of the accurate traffic prediction made by HadoopWatch, we design and implement a simple HadoopWatch-enabled network optimization module in the HadoopWatch controller. With realistic Hadoop job benchmarks, we find that even a simple algorithm can leverage the forecasting results provided by HadoopWatch to reduce Hadoop job completion time by up to 14.72%.