Karthik Pattabiraman, Zbigniew Kalbarczyk and Ravishankar Iyer. To appear in IEEE Transactions on Dependable and Secure Computing (TDSC). Accepted May 1, 2009. [ PDF File ]
Abstract: This paper presents a technique to derive and implement error detectors to protect an application from data errors. The error detectors are derived automatically using compiler-based static analysis from the backward program slice of critical variables in the program. Critical variables are defined as those that are highly sensitive to errors, and deriving error detectors for these variables provides high coverage for errors in any data value used in the program. The error detectors take the form of checking expressions and are optimized for each control flow path followed at runtime. The derived detectors are implemented using a combination of hardware and software and continuously monitor the application at runtime. If an error is detected at runtime, the application is stopped so as to prevent error propagation and enable a clean recovery. Experiments show that the derived detectors achieve low-overhead error detection while providing high coverage for errors that matter to the application.
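To illustrate the idea of a path-specific checking expression, here is a minimal Python sketch. It is not the paper's implementation: the function names (`compute`, `check_detector`), the path labels, and the arithmetic are all hypothetical, and the check is written by hand here rather than derived by a compiler or supported by hardware. It only shows the general shape: the critical variable is recomputed from the inputs in its backward slice along the path actually taken, and the program stops on a mismatch.

```python
class DetectedError(Exception):
    """Raised when a checking expression fails; the application should stop."""


def check_detector(path_id, a, b, observed):
    # One checking expression per control-flow path (hypothetical paths).
    # Each expression recomputes the critical variable from the values
    # in its backward slice along that path.
    if path_id == "then":
        expected = a + 2 * b
    else:
        expected = a - b
    if observed != expected:
        raise DetectedError(
            f"critical variable mismatch on path {path_id!r}: "
            f"observed {observed}, expected {expected}"
        )


def compute(a, b, corrupt=False):
    # `critical` stands in for a critical variable; the branch taken
    # is recorded so the matching checking expression can be selected.
    if a > b:
        path_id = "then"
        t = 2 * b
        critical = a + t
    else:
        path_id = "else"
        critical = a - b

    if corrupt:
        critical ^= 1  # simulate a data error by flipping the lowest bit

    # Detector placed right after the critical variable is computed.
    check_detector(path_id, a, b, critical)
    return critical
```

Calling `compute(5, 2)` passes the check, while `compute(5, 2, corrupt=True)` raises `DetectedError` before the corrupted value can propagate further.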
This paper supersedes our conference paper.