Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms

Mobile and embedded platforms are increasingly required to execute computationally demanding DNNs efficiently across heterogeneous processing elements. At runtime, the hardware resources available to DNNs can vary considerably because of other concurrently running applications, and the performance requirements of the applications can also change between scenarios. To achieve the desired performance, dynamic DNNs have been proposed, in which the number of channels or layers can be scaled in real time to meet different requirements under varying resource constraints. However, training such dynamic DNNs can be costly, since the platform-aware models for each deployment scenario must be retrained to become dynamic. This paper proposes Dynamic-OFA, a novel dynamic DNN approach for state-of-the-art platform-aware NAS models (i.e. the Once-for-All network (OFA)). Dynamic-OFA pre-samples a family of sub-networks from a static OFA backbone model and includes a runtime manager that chooses among these sub-networks under different runtime environments. As such, Dynamic-OFA does not need the traditional dynamic DNN training pipeline. Compared to the state of the art, our experimental results using ImageNet on a Jetson Xavier NX show that the approach is up to 3.5x (CPU) and 2.4x (GPU) faster for similar ImageNet Top-1 accuracy, or 3.8% (CPU) and 5.1% (GPU) more accurate at similar latency.
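
The role of the runtime manager can be illustrated with a minimal sketch. It assumes that each pre-sampled sub-network has been profiled offline for latency and accuracy and that the manager receives a latency budget at runtime. The `SubNetwork` type, the `select_subnetwork` function, the selection policy, and all numbers below are hypothetical placeholders for illustration, not the selection algorithm or results of this paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SubNetwork:
    """One pre-sampled sub-network with offline-profiled metrics (hypothetical)."""
    name: str
    latency_ms: float  # profiled latency on the target processor
    top1_acc: float    # profiled Top-1 accuracy in percent


def select_subnetwork(family: Sequence[SubNetwork],
                      latency_budget_ms: float) -> SubNetwork:
    """Pick the most accurate sub-network that fits the latency budget.

    If nothing fits, fall back to the fastest sub-network available.
    This policy is an assumption made for the sketch.
    """
    if not family:
        raise ValueError("family must contain at least one sub-network")
    feasible = [s for s in family if s.latency_ms <= latency_budget_ms]
    if not feasible:
        return min(family, key=lambda s: s.latency_ms)
    return max(feasible, key=lambda s: s.top1_acc)


# Illustrative values only; they are not measurements from the paper.
FAMILY = [
    SubNetwork("sub_small", latency_ms=20.0, top1_acc=72.0),
    SubNetwork("sub_medium", latency_ms=40.0, top1_acc=76.0),
    SubNetwork("sub_large", latency_ms=80.0, top1_acc=79.0),
]

if __name__ == "__main__":
    # The budget tightens as other applications take up resources.
    for budget in (100.0, 50.0, 10.0):
        chosen = select_subnetwork(FAMILY, budget)
        print(f"budget={budget} ms -> {chosen.name}")
```

Because every sub-network shares the weights of the same static backbone, switching in such a scheme amounts to choosing a different configuration rather than retraining or loading a separate model.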
