vPerfGuard: an automated model-driven framework for application performance diagnosis in consolidated cloud environments

2013 
Many business customers hesitate to move all their applications to the cloud due to performance concerns. White-box diagnosis relies on human expert experience or performance troubleshooting "cookbooks" to find potential performance bottlenecks. Despite wide adoption, the scalability and adaptivity of such approaches remain severely constrained, especially in a highly-dynamic, consolidated cloud environment. Leveraging the rich telemetry collected from applications and systems in the cloud, and the power of statistical learning, vPerfGuard complements the existing approaches with a model-driven framework by: (1) automatically identifying system metrics that are most predictive of application performance, and (2) adaptively detecting changes in the performance and potential shifts in the predictive metrics that may accompany such a change. Although correlation does not imply causation, the predictive system metrics point to potential causes that can guide a cloud service provider to zero in on the root cause. We have implemented vPerfGuard as a combination of three modules: a sensor module, a model building module, and a model updating module. We evaluate its effectiveness using different benchmarks and different workload types, specifically focusing on various resource (CPU, memory, disk I/O) contention scenarios that are caused by workload surges or "noisy neighbors". The results show that vPerfGuard automatically points to the correct performance bottleneck in each scenario, including the type of the contended resource and the host where the contention occurred.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    23
    References
    57
    Citations
    NaN
    KQI
    []