Karthik Pattabiraman, Nithin Nakka, Zbigniew Kalbarczyk and Ravishankar Iyer, To appear in the IEEE Transactions on Computers (TC). November 2013. [PDF file].
Abstract: This paper introduces SymPLFIED, a program-level framework that allows specification of arbitrary error detectors and the verification of their efficacy against hardware errors. SymPLFIED comprehensively enumerates all transient hardware errors in registers, memory, and computation (expressed symbolically as value errors) that potentially evade detection and cause program failure. The framework uses symbolic execution to abstract the state of erroneous values in the program and model checking to comprehensively find all errors that evade detection. We demonstrate the use of SymPLFIED on a widely deployed aircraft collision avoidance application, tcas. Our results show that the SymPLFIED framework can be used to uncover hard-to-detect catastrophic cases caused by transient errors in programs that may not be exposed by random fault injection based validation. Further, the errors exposed by the framework help us formulate a set of error detectors for the application to avoid the catastrophic case and other incorrect outcomes.