Many deep learning (e.g., DNN)-powered mobile and wearable applications today continuously and unobtrusively sense the ambient surroundings to enhance all aspects of human lives. To enable robust and private mobile sensing, DNNs tend to be deployed locally on resource-constrained mobile devices via model compression. Current practice, whether hand-crafted DNN compression techniques that optimize DNN-relative performance (e.g., parameter size) or on-demand DNN compression methods that optimize hardware-dependent metrics (e.g., latency), cannot run locally online because both require offline retraining to ensure accuracy. Moreover, none of them incorporates runtime adaptive compression to account for the dynamic deployment context of mobile applications. To address these challenges, we present AdaSpring, a context-adaptive and self-evolutionary DNN compression framework that enables runtime adaptive DNN compression locally online. Specifically, it introduces ensemble training of a retraining-free, self-evolutionary network that integrates multiple alternative DNN compression configurations (i.e., compressed architectures and weights). It then introduces a runtime search strategy to quickly find the most suitable compression configuration and evolve the corresponding weights. Evaluated on five tasks across three platforms and in a real-world case study, AdaSpring achieves up to 3.1x latency reduction and 4.2x energy efficiency improvement in DNNs compared to hand-crafted compression techniques, while incurring only <= 6.2 ms of runtime-evolution latency.
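To make the runtime search idea concrete, below is a minimal, hypothetical sketch: given a pool of pre-integrated compression configurations, each profiled offline for accuracy, latency, and energy, the device picks the most accurate configuration that fits the current context's budgets. All names, fields, and numbers here are illustrative assumptions, not AdaSpring's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class CompressionConfig:
    name: str          # which compression operators are activated (illustrative)
    accuracy: float    # validation accuracy proxy, profiled offline
    latency_ms: float  # measured inference latency on the target device
    energy_mj: float   # measured energy per inference

def runtime_search(pool: List[CompressionConfig],
                   latency_budget_ms: float,
                   energy_budget_mj: float) -> Optional[CompressionConfig]:
    """Return the most accurate configuration that meets both budgets."""
    feasible = [c for c in pool
                if c.latency_ms <= latency_budget_ms
                and c.energy_mj <= energy_budget_mj]
    return max(feasible, key=lambda c: c.accuracy, default=None)

if __name__ == "__main__":
    pool = [
        CompressionConfig("full",           0.93, 48.0, 120.0),
        CompressionConfig("prune-50%",      0.91, 22.0,  60.0),
        CompressionConfig("prune+quantize", 0.88,  9.0,  25.0),
    ]
    # A tighter context (e.g., low battery) shifts the chosen configuration.
    print(runtime_search(pool, latency_budget_ms=25.0, energy_budget_mj=70.0))
```

In this sketch the search is a simple filter-and-rank over profiled candidates; the paper's actual strategy additionally evolves the selected configuration's weights at runtime without retraining.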