Collaborative inference systems are one of the emerging solutions for deploying deep neural networks (DNNs) at the wireless network edge. Their main idea is to divide a DNN into two parts, where the first is shallow enough to be reliably executed at edge devices of limited computational power, while the second part is executed at an edge server with higher computational capabilities. The main advantage of such systems is that the input of the DNN gets compressed as the subsequent layers of the shallow part extract only the information necessary for the task. As a result, significant communication savings can be achieved compared to transmitting raw input samples. In this work, we study early exiting in the context of collaborative inference, which allows obtaining inference results at the edge device for certain samples, without the need to transmit the partially processed data to the edge server at all, leading to further communication savings. The central part of our system is the transmission-decision (TD) mechanism, which, given the information from the early exit, and the wireless channel conditions, decides whether to keep the early exit prediction or transmit the data to the edge server for further processing. In this paper, we evaluate various TD mechanisms and show experimentally, that for an image classification task over the wireless edge, proper utilization of early exits can provide both performance gains and significant communication savings.
翻译:暂无翻译