Mohammad Rafiuzzaman, Julien Gascon-Samson, Karthik Pattabiraman, and Sathish Gopalakrishnan, To appear in the Proceedings of the ACM/SIGAPP International Conference on Applied Computing (SAC), 2019. Dependable, Adaptive, Distributed Systems (DADS) Track. (Acceptance Rate: 27.5%)
Abstract: We present a technique to predict memory exhaustion failures in devices built for the modern Internet of Things (IoT). These devices can run general-purpose applications at the network edge, processing data locally to minimize latency, bandwidth, and infrastructure costs and to address data safety and privacy concerns. However, because such applications are not optimized for every device, they can cause sudden and unexpected memory exhaustion failures given the limited memory available on IoT devices. Proactive failure prediction techniques are therefore needed to detect the onset of such failures with enough lead time to adapt the application or terminate it safely. Our memory failure prediction technique for applications running on IoT devices uses k-Nearest Neighbor (kNN) based machine learning models. We validated our technique using two third-party applications and a real-world IoT simulation application on two different IoT platforms and on an Amazon EC2 t2.micro instance, for both single-tenancy and multi-tenancy use cases. Our results indicate that our technique significantly outperforms simpler threshold-based techniques: in our test applications, with 180 seconds of lead time, failures were predicted with 88% recall at 74% precision for a single application failure and 76% recall at 71% precision for multi-tenancy failure.
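To make the contrast between kNN-based and threshold-based prediction concrete, here is a minimal Python sketch. It is not the paper's implementation. The window features (latest usage and change across the window), the toy training points, k = 3, and the 0.8 usage threshold are all assumptions chosen for illustration. Each training label stands for "memory was exhausted within the lead time after this window".

```python
import math
from collections import Counter

def window_features(samples):
    """Summarise a window of memory-usage fractions (0..1).

    Hypothetical features: (latest usage, change across the window).
    """
    return (samples[-1], samples[-1] - samples[0])

def knn_predict(train, query, k=3):
    """Majority vote among the k training points nearest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def threshold_predict(samples, limit=0.8):
    """Baseline: raise an alarm whenever the latest usage exceeds `limit`."""
    return samples[-1] > limit

def precision_recall(predicted, actual):
    """Precision and recall for boolean failure predictions."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(not p and a for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up training set: ((usage, change), failed within the lead time?)
TRAIN = [
    ((0.30, 0.00), False),
    ((0.45, 0.02), False),
    ((0.60, 0.00), False),
    ((0.85, -0.10), False),  # high usage, but falling
    ((0.55, 0.20), True),    # moderate usage, rising fast
    ((0.65, 0.15), True),
    ((0.75, 0.20), True),
    ((0.90, 0.10), True),
]

rising_window = [0.40, 0.45, 0.52, 0.60]   # below the threshold, climbing
falling_window = [0.88, 0.86, 0.85, 0.84]  # above the threshold, receding

knn_rising = knn_predict(TRAIN, window_features(rising_window))
knn_falling = knn_predict(TRAIN, window_features(falling_window))
thr_rising = threshold_predict(rising_window)
thr_falling = threshold_predict(falling_window)

print("kNN:      ", knn_rising, knn_falling)
print("threshold:", thr_rising, thr_falling)
```

On this toy data, the kNN vote flags the rising window while usage is still below the fixed limit and ignores the window where usage is high but receding. The threshold baseline does the opposite on both. The `precision_recall` helper shows how recall and precision figures like those quoted in the abstract are computed from a list of predictions and the corresponding ground-truth outcomes.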