Jiesheng Wei, Anna Thomas, Guanpeng Li and Karthik Pattabiraman, Proceedings of the IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2014. (Accept Rate: 30%) [ PDF | Talk ]
Abstract: Hardware errors are on the rise with reducing feature sizes, however tolerating them in hardware is expensive. Researchers have explored software-based techniques for building error resilient applications. Many of these techniques leverage application-specific resilience characteristics to keep overheads low. Understanding application-specific resilience characteristics requires software fault-injection mechanisms that are both accurate and capable of operating at a high-level of abstraction to allow developers to reason about error resilience.
In this paper, we quantify the accuracy of high-level software fault injection mechanisms vis-a-vis those that operate at the assembly or machine code levels. To represent high-level injection mechanisms, we built a fault injector tool based on the LLVM compiler, called LLFI. LLFI performs fault injection at the LLVM intermediate code level of the application, which is close to the source code. We quantitatively evaluate the accuracy of LLFI with respect to assembly level fault injection,and understand the reasons for the differences.