Tag Archives: reliability

DoDOM: Leveraging DOM Invariants for Web 2.0 Application Robustness Testing

Karthik Pattabiraman and Benjamin Zorn, Proceedings of the International Symposium on Software Reliability Engineering (ISSRE), 2010.
[ PDF File ] [Talk slides]
You can find the technical report version of the paper here.

Talk – Good Enough Computer Systems: Reliability on the Cheap

Talk given at the IEEE Computer Society, Vancouver Chapter, May 20, 2010. [ PDF Slides ]

Automated Derivation of Application-specific Error Detectors Using Dynamic Analysis

Karthik Pattabiraman, Giacinto Paolo Saggese, Daniel Chen, Zbigniew Kalbarczyk and Ravishankar Iyer, IEEE Transactions on Dependable and Secure Computing (TDSC), Vol. 8, Issue 5, Sept/Oct 2011. [ PDF File ]

Formal Diagnosis of Hardware Transient Errors in Programs

Layali Rashid, Karthik Pattabiraman and Sathish Gopalakrishnan, Workshop on Silicon Errors in Logic - System Effects (SELSE), 2010. [ PDF File ][ Talk Slides ]

Towards Understanding the Effects of Intermittent Hardware Faults on Programs

Layali Rashid, Karthik Pattabiraman and Sathish Gopalakrishnan, Proceedings of the IEEE International Workshop on Dependable and Secure Nano-computing (WDSN), 2010. [ PDF File | Talk ]

An End-to-end Approach for the Automatic Derivation of Application-aware Error Detectors

Galen Lyle, Shelley Chen, Karthik Pattabiraman, Zbigniew Kalbarczyk and Ravishankar Iyer, Proceedings of the International Conference on Dependable Systems and Networks (DSN), 2009.
[ PDF File | Talk ]

Detecting and Tolerating Asymmetric Races

Paruj Ratanaworabhan, Martin Burtscher, Darko Kirovski, Rahul Nagpal, Benjamin Zorn and Karthik Pattabiraman, Proceedings of the International Symposium on the Principles and Practice of Parallel Programming (PPoPP), 2009. [ PDF File | Talk ]
You can find the technical report version here.

Automated Derivation of Application-aware Error and Attack Detectors

Karthik Pattabiraman, PhD thesis, University of Illinois at Urbana-Champaign (UIUC), May 2009.

Part 1 (Pages 1 – 160)
Part 2 (Pages 161 – 318)

Abstract: As computer systems become more complex, it becomes harder to ensure that they are dependable, i.e., reliable and secure. Existing dependability techniques do not take into account the characteristics of the application and hence detect errors that may not manifest in the application. This results in wasteful detections and high overheads. In contrast to these techniques, this dissertation proposes a novel paradigm called “Application-Aware Dependability”, which leverages application properties to provide low-overhead, targeted detection of errors and attacks that impact the application. The dissertation focuses on the derivation, validation and implementation of application-aware error and attack detectors.

The key insight in this dissertation is that certain data in the program is more important than other data from a reliability or security point of view (we call this the critical data). Protecting only the critical data provides significant performance improvements while achieving high detection coverage. The technique derives error and attack detectors to detect corruptions of critical data at runtime using a combination of static and dynamic approaches. The derived detectors are validated using both experimental approaches and formal verification. The experimental approaches validate the detectors using random fault-injection and known security attacks. The formal approach considers the effect of all possible errors and attacks according to a given fault or threat model and finds the corner cases that escape detection. The detectors have also been implemented in reconfigurable hardware in the context of the Reliability and Security Engine (RSE).

Automated Derivation and Hardware Implementation of Application-Specific Error Detectors

Karthik Pattabiraman, Giacinto Paolo Saggese, Daniel Chen, Zbigniew Kalbarczyk and Ravishankar Iyer, Workshop on Reliability Issues in High-Performance Computing (HPCRI), 2006.
[ PDF File | Talk ]

Superseded by the following conference paper.

Abstract: This paper proposes a novel technique for automated derivation of fine-grained, application-specific error detectors. An algorithm based on dynamic traces of application execution is developed for extracting the optimal set of error detectors for a target application. An automatic framework is proposed for synthesizing the derived detectors in hardware and enabling low-overhead run-time checking of the application execution. Coverage (evaluated using fault injection) of the error detectors obtained using the proposed methodology, the additional hardware resources, and performance overhead for several benchmark programs are also reported.
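As a rough illustration of deriving detectors from dynamic traces, the sketch below learns one simple check per variable from training runs and then applies it at runtime. The abstract does not specify the form of the detectors, so the bounded-range check, the `RangeDetector` class, the `derive_detectors` function and the trace layout are all assumptions made for illustration, not the paper's algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeDetector:
    """Assumed detector form: flags values outside the range seen in training."""
    low: int
    high: int

    def check(self, value):
        # True means the value is consistent with the training traces.
        return self.low <= value <= self.high


def derive_detectors(traces):
    """Derive one detector per variable from a list of execution traces.

    Each trace is a dict mapping a variable name to the values it took
    during one (fault-free) training run. This layout is hypothetical.
    """
    observed = {}
    for trace in traces:
        for name, values in trace.items():
            observed.setdefault(name, []).extend(values)
    return {name: RangeDetector(min(vals), max(vals))
            for name, vals in observed.items()}


# Two training runs of a hypothetical program with a loop index "i".
training = [{"i": [0, 1, 2]}, {"i": [0, 5]}]
detectors = derive_detectors(training)

in_range = detectors["i"].check(3)         # within the learned range [0, 5]
corrupted = detectors["i"].check(1 << 20)  # e.g. a bit flip in a high-order bit
```

A check this simple can be evaluated in a few comparisons, which is the kind of property that makes hardware synthesis of the detectors plausible; how well it covers real faults would have to be measured, for example by fault injection as the abstract describes.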

Position Paper – ToleRace: Tolerating and Detecting Races

Rahul Nagpal, Karthik Pattabiraman, Darko Kirovski and Benjamin Zorn, Second Workshop on Software Tools for Multi-core Systems (STMCS), 2007.
[ PDF File | Talk ]

This paper is superseded by the following conference paper.

This paper introduces ToleRace, a software tool that increases the reliability of multi-threaded programs by tolerating or detecting race conditions. ToleRace modifies the implementation of critical sections at runtime to provide two benefits. First, it allows programs with certain classes of races to operate as though the race did not exist. Second, it allows programmers to detect, probabilistically, many of the remaining races when they happen, with low performance overhead. ToleRace achieves this by judiciously duplicating shared data inside a critical section, thereby providing an illusion of atomicity when the shared data is updated. Our early experiments reveal that the performance overhead of ToleRace is considerably lower than that of existing dynamic race detection tools.
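The idea of duplicating shared data inside a critical section can be sketched as follows. This is a simplified illustration, not ToleRace's implementation: the `ShadowedSection` helper, the dict standing in for shared memory, and the policy of raising on any mismatch are assumptions. On entry the section takes a working copy and an untouched reference copy; on exit it compares the shared value against the reference copy to see whether some other thread wrote to it without holding the lock.

```python
import copy
import threading


class RaceDetected(Exception):
    """Shared data changed while a critical section was running."""


class ShadowedSection:
    """Hypothetical helper (not ToleRace's API): run a critical section on copies."""

    def __init__(self, shared, key, lock):
        self.shared = shared  # dict standing in for shared memory
        self.key = key
        self.lock = lock

    def __enter__(self):
        self.lock.acquire()
        value = self.shared[self.key]
        self.snapshot = copy.deepcopy(value)  # reference copy, never modified
        self.local = copy.deepcopy(value)     # working copy used inside the section
        return self

    def __exit__(self, exc_type, exc, tb):
        try:
            if exc_type is None:
                if self.shared[self.key] != self.snapshot:
                    raise RaceDetected(f"{self.key!r} was modified outside the lock")
                self.shared[self.key] = self.local  # publish the update on exit
        finally:
            self.lock.release()
        return False


shared = {"counter": 10}
lock = threading.Lock()

with ShadowedSection(shared, "counter", lock) as section:
    section.local += 1          # no interference: the update is published on exit

try:
    with ShadowedSection(shared, "counter", lock) as section:
        shared["counter"] = 99  # simulated unsynchronized write by another thread
        section.local += 1      # the section still computes on its own copy
except RaceDetected as err:
    detected = str(err)
```

Because the section reads and writes only its working copy, the interfering write cannot disturb the computation in progress. This sketch only reports the mismatch; choosing which value to keep so that the program can continue as if no race had occurred is left out.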