REPT: Reverse Debugging Of Failures In Deployed Software

Authors:
Weidong Cui, Microsoft Research Redmond
Xinyang Ge, Microsoft Research Redmond
Baris Kasikci, University of Michigan
Ben Niu, Microsoft Research Redmond
Upamanyu Sharma, University of Michigan
Ruoyu Wang, University of California, Santa Barbara
Insu Yun, Georgia Institute of Technology

Introduction:

In this paper, the authors present REPT, a practical system that enables reverse debugging of software failures in deployed systems.

Abstract:

Debugging software failures in deployed systems is important because they impact real users and customers. However, debugging such failures is notoriously hard in practice because developers have to rely on limited information such as memory dumps. The execution history is usually unavailable because high-fidelity program tracing is not affordable in deployed systems.

In this paper, we present REPT, a practical system that enables reverse debugging of software failures in deployed systems. REPT reconstructs the execution history with high fidelity by combining online lightweight hardware tracing of a program's control flow with offline binary analysis that recovers its data flow. It is seemingly impossible to recover data values thousands of instructions before the failure due to information loss and concurrent execution. REPT tackles these challenges by constructing a partial execution order based on timestamps logged by hardware and iteratively performing forward and backward execution with error correction.

We design and implement REPT, deploy it on Microsoft Windows, and integrate it into Windows Debugger. We evaluate REPT on 16 real-world bugs and show that it can recover data values accurately (92% on average) and efficiently (less than 20 seconds) for these bugs. We also show that it enables effective reverse debugging for 14 bugs.
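
To make the idea of recovering data flow from a control-flow trace plus a final memory dump more concrete, here is a minimal toy sketch in Python. It is not REPT's actual algorithm: the three-instruction register machine (`movi`, `mov`, `add`), the function names `learn`, `propagate`, and `recover`, and the example trace are all assumptions made for illustration. The sketch is single-threaded, has no memory operands, and omits error correction and timestamp-based ordering entirely. It only shows why sweeping backward and forward repeatedly can recover values that a single backward pass cannot.

```python
from typing import Dict, List, Optional, Tuple, Union

State = Dict[str, Optional[int]]           # register -> value (None = unknown)
Instr = Tuple[str, str, Union[str, int]]   # (opcode, dst register, src register or immediate)
MASK = 0xFFFFFFFF                          # toy 32-bit registers (assumption)


def learn(state: State, reg: str, value: Optional[int]) -> bool:
    """Record a newly inferred value; return True only if it was unknown before."""
    if value is None or state[reg] is not None:
        return False
    state[reg] = value & MASK
    return True


def propagate(ins: Instr, before: State, after: State) -> bool:
    """Treat one instruction as a constraint between the states around it."""
    op, dst, src = ins
    changed = False
    for reg in before:                     # registers the instruction does not write
        if reg != dst:
            changed |= learn(before, reg, after[reg])
            changed |= learn(after, reg, before[reg])
    if op == "movi":                       # dst = imm; the old dst value is lost
        changed |= learn(after, dst, int(src))
    elif op == "mov" and src != dst:       # dst = src; the old dst value is lost
        changed |= learn(after, dst, before[src])
        changed |= learn(before, src, after[dst])
    elif op == "add" and src != dst:       # dst = dst + src; reversible
        b, s, a = before[dst], before[src], after[dst]
        if b is not None and s is not None:
            changed |= learn(after, dst, b + s)
        if a is not None and s is not None:
            changed |= learn(before, dst, a - s)
        if a is not None and b is not None:
            changed |= learn(before, src, a - b)
    return changed


def recover(trace: List[Instr], dump: State) -> List[State]:
    """Return states where states[i] holds the registers before trace[i];
    states[-1] is the final dump. Sweeps repeat until nothing new is learned."""
    states: List[State] = [{reg: None for reg in dump} for _ in trace] + [dict(dump)]
    changed = True
    while changed:
        changed = False
        for i in reversed(range(len(trace))):   # backward sweep, starting at the dump
            changed |= propagate(trace[i], states[i], states[i + 1])
        for i in range(len(trace)):             # forward sweep
            changed |= propagate(trace[i], states[i], states[i + 1])
    return states


# Hypothetical trace: the last instruction overwrites c, so the dump alone
# does not reveal the value c held earlier.
trace: List[Instr] = [("movi", "a", 5), ("mov", "b", "a"), ("add", "b", "c"), ("movi", "c", 0)]
dump: State = {"a": 5, "b": 12, "c": 0}
states = recover(trace, dump)
for i, state in enumerate(states):
    print(i, state)
```

In this example the first backward sweep cannot undo `add b, c`, because `c` was overwritten before the dump. The value of `b` before the `add` then becomes known from the preceding `mov`, which lets the next sweep solve for the lost value of `c` (7). Registers whose earlier contents were overwritten without leaving any constraint, such as `b` at the start of the trace, stay unknown.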

You may want to know: