|Gueyoung Jung||AT&T Labs - Research, USA|
|Parisa Rahimzadeh||University of Colorado Boulder, USA|
|Zhang Liu||University of Colorado Boulder, USA|
|Sangtae Ha||University of Colorado Boulder, USA|
|Kaustubh Joshi||AT&T Labs - Research, USA|
|Matti Hiltunen||AT&T Labs - Research, USA|
VM redundancy is the foundation of resilient cloud applications. While active-active approaches combined with load balancing and autoscaling are usually resource efficient, the stateful nature of many cloud applications often necessitates 1+1 (or 1+n) active-standby approaches. Keeping the standbys, however, can result in inefficient utilization of cloud resources. We explore an intriguing cloud-based solution in which standby VMs from active-standby applications are selectively overbooked to reduce the resources reserved for failures. The approach requires careful VM placement to avoid a situation where multiple standby VMs activate simultaneously on the same host and therefore cannot all receive their full resource entitlement. Today's clouds, however, lack this kind of visibility into applications. We rectify this situation through ShadowBox, a novel redundancy-aware VM scheduler that optimizes the placement and activation of standby VMs while assuring applications' resource entitlements. Evaluation on a large-scale cloud shows that ShadowBox can significantly improve resource utilization (by more than 2.5 times compared with traditional approaches) while minimizing the impact on applications' entitlements.
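The placement constraint described above can be illustrated with a minimal sketch. This is not ShadowBox's algorithm, which the abstract does not specify. It is a simple first-fit heuristic under stated assumptions: at most one host fails at a time, `capacity` is the hypothetical spare capacity each host sets aside for standbys, and `overbook` is a hypothetical cap on how far standby reservations may exceed that capacity. The function name `place_standbys` and all parameters are illustrative.

```python
from collections import defaultdict


def place_standbys(pairs, hosts, capacity, overbook=2.0):
    """First-fit placement of standby VMs with overbooking (illustrative only).

    pairs:    list of (vm_id, active_host, size) tuples
    hosts:    ordered list of candidate host names
    capacity: dict host -> capacity available to activated standbys (assumed)
    overbook: max ratio of reserved standby size to capacity (assumed)
    Returns a dict vm_id -> host chosen for that VM's standby.
    """
    reserved = defaultdict(float)  # total standby size booked on each host
    # by_source[h][f]: standby size on host h whose active VM runs on host f
    by_source = defaultdict(lambda: defaultdict(float))
    placement = {}

    for vm_id, active_host, size in pairs:
        for host in hosts:
            if host == active_host:
                continue  # a standby must not share a host with its active VM
            # If active_host fails, every standby on `host` backed by it
            # activates at once; together they must fit within capacity.
            if by_source[host][active_host] + size > capacity[host]:
                continue
            # Standbys may be overbooked overall, but only up to the cap.
            if reserved[host] + size > overbook * capacity[host]:
                continue
            reserved[host] += size
            by_source[host][active_host] += size
            placement[vm_id] = host
            break
        else:
            raise ValueError(f"no feasible host for standby of {vm_id}")
    return placement


placement = place_standbys(
    pairs=[("a", "h1", 4), ("b", "h2", 4), ("c", "h1", 4), ("d", "h3", 4)],
    hosts=["h1", "h2", "h3"],
    capacity={"h1": 4, "h2": 4, "h3": 4},
)
print(placement)
```

In this example the standbys of `b` and `d` share host `h1`, reserving 8 units against a capacity of 4. That is acceptable under the single-failure assumption because their active VMs run on different hosts and so would not fail together. The standby of `c` is kept off `h2` because it would activate together with the standby of `a` if `h1` failed.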