One-shot neural architecture search (NAS) substantially improves search efficiency by training one supernet to estimate the performance of every possible child architecture (i.e., subnet). However, the inconsistency of characteristics among subnets incurs serious interference in the optimization, resulting in poor performance ranking correlation among subnets. Subsequent explorations decompose the supernet weights via a particular criterion, e.g., gradient matching, to reduce the interference; yet they suffer from huge computational cost and low space separability. In this work, we propose NAS-LID, a lightweight and effective method based on local intrinsic dimension (LID). NAS-LID evaluates the geometrical properties of architectures by computing low-cost LID features layer by layer, and the similarity characterized by LID enjoys better separability than gradients, thus effectively reducing the interference among subnets. Extensive experiments on NASBench-201 indicate that NAS-LID achieves superior performance with better efficiency. Specifically, compared to the gradient-driven method, NAS-LID saves up to 86% of GPU memory overhead when searching on NASBench-201. We also demonstrate the effectiveness of NAS-LID on the ProxylessNAS and OFA spaces. Source code: https://github.com/marsggbo/NAS-LID.
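For intuition, here is a minimal sketch of how a per-layer LID profile could be computed from a subnet's activations. This is an illustrative reconstruction, not the released implementation: it uses the classic maximum-likelihood LID estimator (Levina & Bickel, 2004) on flattened layer activations, and the function names `lid_mle` and `lid_profile` as well as the choice of k = 20 neighbors are assumptions for this sketch.

```python
import numpy as np

def lid_mle(activations: np.ndarray, k: int = 20) -> float:
    """MLE estimate of local intrinsic dimension (Levina & Bickel, 2004)
    for a batch of flattened activations, shape (n_samples, n_features)."""
    n = activations.shape[0]
    k = min(k, n - 1)  # cannot use more neighbors than other samples
    # Pairwise Euclidean distances within the batch: shape (n, n).
    dists = np.linalg.norm(
        activations[:, None, :] - activations[None, :, :], axis=-1
    )
    # For each sample, distances to its k nearest neighbors (excluding itself).
    knn = np.sort(dists, axis=1)[:, 1 : k + 1]
    eps = 1e-12  # guard against log(0) on duplicate points
    # MLE: LID(x) = -(mean_{i<k} log(T_i / T_k))^{-1}; average over the batch.
    lids = -1.0 / np.mean(np.log(knn[:, :-1] / (knn[:, -1:] + eps) + eps), axis=1)
    return float(np.mean(lids))

def lid_profile(layer_activations: list, k: int = 20) -> np.ndarray:
    """Per-layer LID feature vector for one subnet: one scalar per layer."""
    return np.array(
        [lid_mle(a.reshape(a.shape[0], -1), k) for a in layer_activations]
    )
```

Under this reading, two subnets whose layer-wise LID profiles are close (e.g., by cosine similarity) can be grouped into the same sub-supernet, mirroring how gradient-matching methods partition the space by gradient similarity, but without storing per-subnet gradients, which is where the memory savings come from.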