One-shot neural architecture search (NAS) substantially improves search efficiency by training one supernet to estimate the performance of every possible child architecture (i.e., subnet). However, the inconsistency of characteristics among subnets incurs serious interference during optimization, resulting in a poor performance-ranking correlation among subnets. Subsequent works decompose the supernet weights via a particular criterion, e.g., gradient matching, to reduce this interference; however, they suffer from high computational cost and poor space separability. In this work, we propose NAS-LID, a lightweight and effective method based on local intrinsic dimension (LID). NAS-LID evaluates the geometrical properties of architectures by calculating low-cost LID features layer by layer, and the similarity characterized by LID enjoys better separability than gradient-based similarity, which thus effectively reduces the interference among subnets. Extensive experiments on NASBench-201 indicate that NAS-LID achieves superior performance with better efficiency. Specifically, compared to the gradient-driven method, NAS-LID saves up to 86% of GPU memory overhead when searching on NASBench-201. We also demonstrate the effectiveness of NAS-LID on the ProxylessNAS and OFA spaces. Source code: https://github.com/marsggbo/NAS-LID.
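To make the core idea concrete, below is a minimal sketch (not the authors' released implementation) of how layer-wise LID features and an LID-based subnet similarity could be computed. It uses the standard maximum-likelihood LID estimator from nearest-neighbor distances and cosine similarity between layer-wise LID profiles; the function names, the k=20 default, and the toy usage at the end are illustrative assumptions.

```python
import numpy as np

def lid_mle(activations, k=20):
    """Maximum-likelihood LID estimate for each sample in a batch of layer
    activations, computed from distances to its k nearest neighbors
    (a standard Levina-Bickel-style estimator; k=20 is an assumed default)."""
    X = np.asarray(activations, dtype=np.float64).reshape(len(activations), -1)
    # Pairwise Euclidean distances within the batch.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Distances to the k nearest neighbors, excluding the point itself.
    r = np.sort(dists, axis=1)[:, 1:k + 1]
    # LID(x) ~= -( (1/k) * sum_i log(r_i / r_k) )^(-1)
    log_ratios = np.log(r / (r[:, -1:] + 1e-12) + 1e-12)
    return -1.0 / np.mean(log_ratios, axis=1)

def lid_profile(layer_activations, k=20):
    """One mean LID value per layer: a low-dimensional geometric
    'fingerprint' of an architecture on a given input batch.
    `layer_activations` is a list of (batch, ...) arrays, one per layer."""
    return np.array([lid_mle(a, k).mean() for a in layer_activations])

def lid_similarity(p1, p2):
    """Cosine similarity between two layer-wise LID profiles; architectures
    with similar profiles could be grouped in the same supernet partition."""
    return float(np.dot(p1, p2) / (np.linalg.norm(p1) * np.linalg.norm(p2) + 1e-12))

# Toy usage: two hypothetical subnets, three layers of activations each.
rng = np.random.default_rng(0)
acts_a = [rng.standard_normal((64, 128)) for _ in range(3)]
acts_b = [rng.standard_normal((64, 128)) for _ in range(3)]
print(lid_similarity(lid_profile(acts_a), lid_profile(acts_b)))
```

Because each architecture is summarized by a short vector (one scalar per layer) rather than by full per-parameter gradients, comparing subnets this way is far cheaper in memory, which is consistent with the GPU-memory savings reported above.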