Hazem A. Abdelhafez, Hassan Halawa, Mohamed Ahmed, Karthik Pattabiraman, and Matei Ripeanu. ACM/IEEE Symposium on Edge Computing (SEC), 2021 (Acceptance Rate: 27%).
Abstract: A common feature of devices deployed at the edge today is their configurability. The NVIDIA Jetson AGX, for example, has a user-configurable frequency range larger than one order of magnitude for the CPU, the GPU, and the memory controller. Key to making effective use of this configurability is the ability to anticipate the application-level impact of a frequency configuration choice. To this end, this paper presents a novel modeling approach for predicting the runtime and power consumption of convolutional neural networks (CNNs). This modeling approach is: (i) effective – i.e., makes predictions with low error (models achieve an average relative error of 15.4% for runtime and 14.9% for energy); (ii) efficient – i.e., requires a low cost to make predictions; (iii) generic – i.e., supports deploying updated and possibly different deep learning inference models without the need for retraining; and (iv) practical – i.e., requires a low training cost. Three features, all geared towards meeting the challenges of deploying in a real-world environment, set this work apart: (i) the focus on predicting the impact of the frequency configuration choice; (ii) the methodological choice to aggregate predictions at fine (i.e., kernel-level) granularity, which provides generality; and (iii) taking into account the inter-node variability among nominally identical devices.
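The kernel-level aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes that a network's runtime is the sum of its per-kernel runtimes and that its energy is the sum of per-kernel runtime times average power. The names `FreqConfig`, `Kernel`, `predict_network`, `mean_relative_error`, and `toy_model` are hypothetical, and the toy per-kernel model is a placeholder, not a fitted model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class FreqConfig:
    """Hypothetical frequency configuration (MHz) for the three tunable units."""
    cpu_mhz: float
    gpu_mhz: float
    mem_mhz: float


@dataclass(frozen=True)
class Kernel:
    """Hypothetical description of one kernel in a CNN."""
    name: str
    work: float  # abstract work units, an assumption made for this sketch


# A per-kernel predictor returns (runtime in seconds, average power in watts).
KernelModel = Callable[[Kernel, FreqConfig], Tuple[float, float]]


def predict_network(kernels: Sequence[Kernel], cfg: FreqConfig,
                    model: KernelModel) -> Tuple[float, float]:
    """Aggregate per-kernel predictions into network-level (runtime, energy).

    Assumes kernels run one after another, so runtimes and energies add up.
    """
    runtime_s = 0.0
    energy_j = 0.0
    for kernel in kernels:
        t, p = model(kernel, cfg)
        runtime_s += t
        energy_j += t * p  # energy = time * average power
    return runtime_s, energy_j


def mean_relative_error(predicted: Sequence[float],
                        measured: Sequence[float]) -> float:
    """Average of |predicted - measured| / measured over all samples."""
    errors = [abs(p - m) / m for p, m in zip(predicted, measured)]
    return sum(errors) / len(errors)


def toy_model(kernel: Kernel, cfg: FreqConfig) -> Tuple[float, float]:
    """Placeholder predictor: runtime scales inversely with GPU frequency."""
    runtime_s = kernel.work / cfg.gpu_mhz
    power_w = 5.0 + 0.01 * cfg.gpu_mhz
    return runtime_s, power_w


if __name__ == "__main__":
    net = [Kernel("conv1", 1000.0), Kernel("relu1", 500.0)]
    cfg = FreqConfig(cpu_mhz=1200.0, gpu_mhz=500.0, mem_mhz=1600.0)
    print(predict_network(net, cfg, toy_model))
```

Because the predictor operates on kernels rather than whole networks, a new or updated CNN built from the same kinds of kernels could in principle be evaluated by calling `predict_network` with a different kernel list, which matches the "generic" property claimed in the abstract.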