Tag Archives: Resilient

Improving the Accuracy of IR-Level Fault Injection

Lucas Palazzi, Guanpeng Li, Bo Fang, and Karthik Pattabiraman, To appear in the IEEE Transactions on Dependable and Secure Computing (TDSC). (Acceptance date: March 2020). [PDF]

A Tale of Two Injectors: End-to-End Comparison of IR-level and Assembly-Level Fault Injection

Lucas Palazzi, Guanpeng Li, Bo Fang, and Karthik Pattabiraman, IEEE International Symposium on Software Reliability Engineering (ISSRE), 2019. (Acceptance Rate: 31.4%) [ PDF | Talk ] (code)

BinFI: An Efficient Fault Injector for Safety-Critical Machine Learning Systems

Zitao Chen, Guanpeng Li, Karthik Pattabiraman, and Nathan DeBardeleben, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2019. (Acceptance Rate: 21%) [ PDF | Talk ] (Code)

BonVoision: Leveraging Spatial Data Smoothness for Recovery from Memory Soft Errors

Bo Fang, Hassan Halawa, Karthik Pattabiraman, Matei Ripeanu, and Sriram Krishnamurthy, Proceedings of the ACM International Conference on Supercomputing (ICS), 2019. (Acceptance Rate: 23.2%) [ PDF | Talk ]

TensorFI: A Configurable Fault Injector for TensorFlow Applications

Guanpeng Li, Karthik Pattabiraman, and Nathan DeBardeleben, Workshop on Software Certification (WoSoCER), co-located with the IEEE International Symposium on Software Reliability Engineering (ISSRE), 2018. [ PDF | Talk Slides ] (Code)

Modeling Soft-Error Propagation in Programs

Guanpeng Li, Karthik Pattabiraman, Siva Kumar Sastry Hari, Michael Sullivan, and Timothy Tsai. IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2018. (Acceptance Rate for Regular Papers: 25%) [ PDF | Talk ] (Link to Code) (Best Paper Runner up)

Modeling Input Dependent Error Propagation in Programs

Guanpeng Li and Karthik Pattabiraman, IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2018. (Acceptance Rate for Regular Papers: 25%) [PDF | Talk] (Link to Code)

Understanding Error Propagation in Deep-Learning Neural Networks (DNN) Accelerators and Applications

Guanpeng Li, Siva Hari, Michael Sullivan, Timothy Tsai, Karthik Pattabiraman, Joel Emer, Stephen Keckler, International Conference for High-Performance Computing, Networking, Storage and Analysis (SC), 2017. (Acceptance Rate: 19%) [PDF | Talk] (Injector code)

LetGo: A Lightweight Continuous Framework for HPC Applications Under Failures

Bo Fang, Qiang Guan, Nathan DeBardeleben, Karthik Pattabiraman, and Matei Ripeanu, ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2017. (Acceptance Rate: 19%) [ PDF | Talk ]

One Bit is (Not) Enough: An Empirical Study of the Impact of Single and Multiple Bit-Flip Errors

Behrooz Sangchoolie, Karthik Pattabiraman, and Johan Karlsson, IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2017. (Acceptance Rate: 23%) [ PDF | Talk ]
