Building Robust ML Systems to Training Data Faults

Contact: Abraham Chan, email

Evaluating how ensembles improve ML resilience

Machine learning applications are deployed in many critical domains today. Unlike programmed software, the behaviour of ML applications is based on the training data provided. However, training data can be faulty, whether through human mistakes during data collection or through automated labelling.
Therefore, it is important to understand how faulty training data affect ML models, and how we could build more robust ML models to mitigate their effects.

Abraham Chan, Niranjhana Narayanan, Arpan Gujarati, Karthik Pattabiraman, and Sathish Gopalakrishnan, Understanding the Resilience of Neural Network Ensembles against Faulty Training Data, Proceedings of the IEEE International Symposium on Quality, Reliability and Security (QRS), 2021. (Code) Best Paper Award (1 of 3).

Training Data Faults

Machine learning (ML) is widely deployed in safety-critical systems (e.g. self-driving cars). Failures in these systems can have disastrous consequences, so ensuring the reliability of their operation is important. Mutation testing is a popular method for assessing the dependability of applications, and mutation tools have recently been developed for ML frameworks. However, their focus has been on improving the quality of test data. We present an open source data mutation tool, TensorFlow Data Mutator (TF-DM), which targets different kinds of data faults for any ML program written in TensorFlow 2. TF-DM supports different types of data mutators so that users can study model resilience to data faults.

Niranjhana Narayanan and Karthik Pattabiraman, TF-DM: Tool for Studying ML Model Resilience to Data Faults, DeepTest, 2021. [PDF | Code]
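
To make the idea of a data mutator concrete, here is a minimal sketch in plain Python. It is not TF-DM's API: the function names `mislabel` and `remove_samples`, their parameters, and the choice of these two fault types are assumptions made purely for illustration.

```python
import random


def mislabel(labels, fraction, num_classes, seed=0):
    """Return a copy of `labels` in which a `fraction` of entries
    is replaced by a different, randomly chosen class.

    Hypothetical helper for illustration; not part of TF-DM.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    if num_classes < 2:
        raise ValueError("need at least two classes to mislabel")
    rng = random.Random(seed)
    mutated = list(labels)
    n_faulty = round(fraction * len(mutated))
    for i in rng.sample(range(len(mutated)), n_faulty):
        wrong = rng.randrange(num_classes - 1)
        # Skip over the original label so the new label always differs.
        mutated[i] = wrong if wrong < mutated[i] else wrong + 1
    return mutated


def remove_samples(samples, labels, fraction, seed=0):
    """Drop a `fraction` of (sample, label) pairs, keeping the original order.

    Hypothetical helper for illustration; not part of TF-DM.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = random.Random(seed)
    n_keep = len(samples) - round(fraction * len(samples))
    keep = sorted(rng.sample(range(len(samples)), n_keep))
    return [samples[i] for i in keep], [labels[i] for i in keep]


# Example: corrupt 20% of the labels of a toy 3-class dataset.
clean_labels = [0, 1, 2] * 10
faulty_labels = mislabel(clean_labels, fraction=0.2, num_classes=3)
```

Training the same model once on `clean_labels` and once on `faulty_labels`, then comparing test accuracy, is one simple way to study resilience to this kind of fault.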

N-Version Programming in Machine Learning

We revisit N-Version Programming (NVP) in the context of machine learning (ML). Generating N versions of an ML component does not require additional programming effort, only extra computation. This opens up the possibility of executing hundreds of diverse replicas, which, if carefully deployed, can improve the overall reliability of the component by a significant margin. We use mathematical modeling to evaluate these benefits.

Arpan Gujarati, Sathish Gopalakrishnan, and Karthik Pattabiraman, New Wine in an Old Bottle: N-Version Programming for Machine Learning Components, IEEE International Workshop on Software Certification (WoSoCER), 2020. Held in conjunction with the IEEE International Symposium on Software Reliability Engineering (ISSRE), 2020. [PDF][Talk]
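
As a rough illustration of why many replicas can help, the sketch below computes the probability that a strict majority of N replicas is correct. It is not the model from the paper: it assumes that every replica is correct with the same probability `p` and that replicas fail independently, which real ML replicas generally do not. The function name `majority_vote_reliability` is hypothetical.

```python
from math import comb


def majority_vote_reliability(n, p):
    """Probability that a strict majority of n replicas is correct.

    Assumes each replica is correct independently with probability p
    (a simplifying assumption; correlated failures reduce the benefit).
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("use an odd n >= 1 so that ties cannot occur")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    k_min = n // 2 + 1  # smallest number of correct replicas that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Example: compare a single replica with 3 and 101 replicas at p = 0.9.
for n in (1, 3, 101):
    print(n, majority_vote_reliability(n, 0.9))
```

With `p = 0.9`, three independent replicas already give approximately 0.972. Under the same assumptions, voting makes things worse when `p` is below 0.5, which is one reason the diversity and deployment of replicas matter.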