Device-edge collaboration on deep neural network (DNN) inference is a promising approach to efficiently utilizing network resources for supporting artificial intelligence of things (AIoT) applications. In this paper, we propose a novel digital twin (DT)-assisted approach to device-edge collaboration on DNN inference that determines whether and when to stop local inference at a device and upload the intermediate results to complete the inference on an edge server. Instead of determining the collaboration for each DNN inference task only upon its generation, multi-step decision-making is performed during the on-device inference to adapt to the dynamic computing workload status at the device and the edge server. To enhance the adaptivity, a DT is constructed to evaluate all potential offloading decisions for each DNN inference task, which provides augmented training data for a machine learning-assisted decision-making algorithm. Then, another DT is constructed to estimate the inference status at the device to avoid frequently fetching the status information from the device, thus reducing the signaling overhead. We also derive necessary conditions for optimal offloading decisions to reduce the offloading decision space. Simulation results demon-strate the outstanding performance of our DT-assisted approach in terms of balancing the tradeoff among inference accuracy, delay, and energy consumption.
翻译:暂无翻译