Fine-Grained Fault Tolerance for Resilient pVM-Based Virtual Machine Monitors

2020 
Virtual machine monitors (VMMs) play a crucial role in the software stack of cloud computing platforms: their design and implementation have a major impact on performance, security and fault tolerance. In this paper, we focus on the latter aspect (fault tolerance), which has received less attention, although it is now a significant concern. Our work aims at improving the resilience of the "pVM-based" VMMs, a popular design pattern for virtualization platforms. In such a design, the VMM is split into two main components: a bare-metal hypervisor and a privileged guest virtual machine (pVM). We highlight that the pVM is the least robust component and that the existing fault-tolerance approaches provide limited resilience guarantees or prohibitive overheads. We present three design principles (disaggregation, specialization, and pro-activity), as well as optimized implementation techniques for building a resilient pVM without sacrificing end-user application performance. We validate our contribution on the mainstream Xen platform.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    27
    References
    0
    Citations
    NaN
    KQI
    []