Optimal Model Placement and Online Model Splitting for Device-Edge Co-Inference

Device-edge co-inference opens up new possibilities for resource-constrained wireless devices (WDs) to execute deep neural network (DNN)-based applications with heavy computation workloads. In particular, the WD executes the first few layers of the DNN and sends the intermediate features to the edge server, which processes the remaining layers. The model splitting decision trades off local computation cost against communication overhead. In practice, the DNN model is re-trained and updated periodically at the edge server. Once the DNN parameters are regenerated, part of the updated model must be placed at the WD to facilitate on-device inference. In this paper, we study the joint optimization of the model placement and online model splitting decisions to minimize the energy-and-time cost of device-edge co-inference in the presence of wireless channel fading. The problem is challenging because the model placement and model splitting decisions are strongly coupled while operating on two different time scales. We first tackle online model splitting by formulating an optimal stopping problem, where the finite horizon of the problem is determined by the model placement decision. In addition to deriving the optimal model splitting rule based on backward induction, we investigate a simple one-stage look-ahead rule, for which we obtain analytical expressions of the model splitting decision. This analysis allows us to efficiently optimize the model placement decision on the larger time scale. In particular, we obtain a closed-form model placement solution for the fully-connected multilayer perceptron with the same number of neurons in every layer. Simulation results validate the superior performance of the jointly optimized model placement and splitting across various DNN structures.
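The two splitting rules named in the abstract, backward induction and one-stage look-ahead, can be illustrated with a toy finite-horizon stopping problem. The sketch below is not the paper's formulation: the cost model (a Shannon-rate transmission cost plus a fixed edge cost), the i.i.d. discrete fading distribution, and all function and parameter names are assumptions made for illustration. The horizon n stands for the deepest layer placed at the WD.

```python
import math

# Hypothetical cost model (an assumption, not taken from the paper):
# splitting after layer k sends feat_bits[k] over a channel with gain h
# at rate log2(1 + h), then pays edge_cost[k] for the remaining layers.
def stop_cost(k, h, feat_bits, edge_cost):
    return feat_bits[k] / math.log2(1.0 + h) + edge_cost[k]

def expected_stop_cost(k, feat_bits, edge_cost, gains, probs):
    return sum(p * stop_cost(k, h, feat_bits, edge_cost)
               for h, p in zip(gains, probs))

def backward_induction(local_cost, feat_bits, edge_cost, gains, probs):
    """Optimal continuation costs cont[k] for k = 0..n-1, plus E[V_0]."""
    n = len(feat_bits) - 1  # horizon: deepest layer placed at the device
    # At the horizon the device must offload, so V_n(h) = stop_cost(n, h).
    exp_v = expected_stop_cost(n, feat_bits, edge_cost, gains, probs)
    cont = [0.0] * n
    for k in range(n - 1, -1, -1):
        # Continuing means computing layer k+1 locally, then acting optimally.
        cont[k] = local_cost[k + 1] + exp_v
        exp_v = sum(p * min(stop_cost(k, h, feat_bits, edge_cost), cont[k])
                    for h, p in zip(gains, probs))
    return cont, exp_v

def one_stage_lookahead(local_cost, feat_bits, edge_cost, gains, probs):
    """Thresholds that compare stopping now with stopping one layer later."""
    n = len(feat_bits) - 1
    return [local_cost[k + 1]
            + expected_stop_cost(k + 1, feat_bits, edge_cost, gains, probs)
            for k in range(n)]

def split_point(thresholds, channel_path, feat_bits, edge_cost):
    """First layer k at which stopping is no costlier than the threshold."""
    for k, threshold in enumerate(thresholds):
        if stop_cost(k, channel_path[k], feat_bits, edge_cost) <= threshold:
            return k
    return len(thresholds)  # forced split at the horizon

# Toy instance (all numbers are made up for illustration).
local_cost = [0.0, 1.0, 1.0, 1.0]   # local_cost[k]: cost of computing layer k
feat_bits = [8.0, 4.0, 2.0, 1.0]    # feature size after layer k (k = 0: raw input)
edge_cost = [0.3, 0.2, 0.1, 0.0]    # edge cost of the remaining layers
gains, probs = [1.0, 3.0], [0.5, 0.5]

optimal, expected_total = backward_induction(
    local_cost, feat_bits, edge_cost, gains, probs)
lookahead = one_stage_lookahead(
    local_cost, feat_bits, edge_cost, gains, probs)
print(optimal, lookahead, expected_total)
print(split_point(optimal, [3.0, 3.0, 3.0, 3.0], feat_bits, edge_cost))
```

In this toy setting the look-ahead threshold is never below the optimal continuation cost, so the look-ahead rule splits no later than the optimal rule, and the two coincide at the last decision stage. Changing the length of `feat_bits` changes the horizon, which is how the placement decision enters the splitting problem here.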

