One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

Source: arXiv. Accepted by ACM SIGMETRICS 2022. Published in the Proceedings of the ACM on Measurement and Analysis of Computing Systems, vol. 5, no. 3, Article 34, December 2021.

Convolutional neural networks (CNNs) are used in numerous real-world applications such as vision-based autonomous driving and video content analysis. To run CNN inference on various target devices, hardware-aware neural architecture search (NAS) is crucial. A key requirement of efficient hardware-aware NAS is the fast evaluation of inference latencies in order to rank different architectures. While building a latency predictor for each target device is common in state-of-the-art approaches, it is a very time-consuming process that does not scale to extremely diverse devices. In this work, we address the scalability challenge by exploiting latency monotonicity: the architecture latency rankings on different devices are often correlated. When strong latency monotonicity exists, we can re-use architectures searched for one proxy device on new target devices, without losing optimality. In the absence of strong latency monotonicity, we propose an efficient proxy adaptation technique to significantly boost the latency monotonicity. Finally, we validate our approach and conduct experiments with devices of different platforms on multiple mainstream search spaces, including MobileNet-V2, MobileNet-V3, NAS-Bench-201, ProxylessNAS and FBNet. Our results highlight that, by using just one proxy device, we can find almost the same Pareto-optimal architectures as the existing per-device NAS, while avoiding the prohibitive cost of building a latency predictor for each device.
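To make the idea of latency monotonicity concrete, here is a minimal Python sketch that scores how well the latency ranking on one device agrees with the ranking on another. It assumes Spearman's rank correlation as the agreement measure, which is a natural choice for comparing rankings; the abstract itself does not name a metric. The function names and the latency values are hypothetical and chosen purely for illustration; they are not measurements from the paper.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with the entry at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical latencies (ms) of the same five architectures on two devices.
proxy_latency = [12.0, 18.5, 25.1, 31.0, 40.2]
target_latency = [30.3, 41.0, 39.5, 66.2, 80.9]

monotonicity = spearman(proxy_latency, target_latency)
print(f"rank correlation between proxy and target: {monotonicity:.2f}")
```

A value near 1 means the two devices rank architectures almost identically, which is the situation in which architectures searched on the proxy can be reused on the target. A lower value corresponds to the case the abstract handles with proxy adaptation.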
