Tag Archives: Resilient
BonVoision: Leveraging Spatial Data Smoothness for Recovery from Memory Soft Errors
Bo Fang, Hassan Halawa, Karthik Pattabiraman, Matei Ripeanu and Sriram Krishnamoorthy, Proceedings of the ACM International Conference on Supercomputing (ICS), 2019. (Acceptance Rate: 23.2%) [ PDF | Talk ]
Filed under papers
TensorFI: A Configurable Fault Injector for TensorFlow Applications
Guanpeng Li, Karthik Pattabiraman, and Nathan DeBardeleben, Workshop on Software Certification (WoSoCER), co-located with the IEEE International Symposium on Software Reliability Engineering (ISSRE), 2018. [ PDF | Talk Slides ] (Code)
Modeling Soft-Error Propagation in Programs
Guanpeng Li, Karthik Pattabiraman, Siva Kumar Sastry Hari, Michael Sullivan, and Timothy Tsai, IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2018. (Acceptance Rate for Regular Papers: 25%) [ PDF | Talk ] (Link to Code) (Best Paper Runner-up)
Modeling Input Dependent Error Propagation in Programs
Guanpeng Li and Karthik Pattabiraman, IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2018. (Acceptance Rate for Regular Papers: 25%) [PDF | Talk] (Link to Code)
Understanding Error Propagation in Deep-Learning Neural Networks (DNN) Accelerators and Applications
Guanpeng Li, Siva Hari, Michael Sullivan, Timothy Tsai, Karthik Pattabiraman, Joel Emer, Stephen Keckler, International Conference for High-Performance Computing, Networking, Storage and Analysis (SC), 2017. (Acceptance Rate: 19%) [PDF | Talk] (Injector code)
Chosen for IEEE Top Picks in Test and Reliability (TPTR), 2023.
LetGo: A Lightweight Continuous Framework for HPC Applications Under Failures
Bo Fang, Qiang Guan, Nathan DeBardeleben, Karthik Pattabiraman, and Matei Ripeanu, ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2017. (Acceptance Rate: 19%) [ PDF | Talk ]
One Bit is (Not) Enough: An Empirical Study of the Impact of Single and Multiple Bit-Flip Errors
Behrooz Sangchoolie, Karthik Pattabiraman, and Johan Karlsson, IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2017. (Acceptance Rate: 23%). [ PDF | Talk ]
IPA: Error Propagation Analysis of Multi-threaded Programs Using Likely Invariants
Abraham Chan, Stefan Winter, Habib Saissi, Karthik Pattabiraman and Neeraj Suri. Proceedings of the IEEE International Conference on Software Testing, Verification and Validation (ICST), 2017. (Acceptance Rate: 27%) [PDF | Talk]
Configurable Detection of SDC-Causing Errors in Programs
Qining Lu, Guanpeng Li, Karthik Pattabiraman, Meeta Gupta and Jude Rivers, ACM Transactions on Embedded Computing Systems (TECS). [ PDF ]
Understanding Error Propagation in GPGPU Applications
Guanpeng Li, Karthik Pattabiraman, Chen-Yong Cher and Pradip Bose, International Conference for High-Performance Computing, Networking, Storage and Analysis (SC), 2016. (Acceptance Rate: 18%) [PDF | Talk] (Link to LLFI-GPU)