Recently, the efficiency of automatic neural architecture design has been significantly improved by gradient-based search methods such as DARTS. However, recent literature has cast doubt on the generalization ability of DARTS, arguing that DARTS performs poorly when the search space is changed, i.e., when a different set of candidate operators is used. Regularization techniques such as early stopping have been proposed to partially solve this problem. In this paper, we tackle this problem from a different perspective by identifying two factors that contribute to the collapse of DARTS when the search space changes: (1) the correlation of similar operators incurs unfavorable competition among them and makes their relative importance scores unreliable, and (2) there is an optimization complexity gap between the proxy search stage and the final training. Based on these findings, we propose a new hierarchical search algorithm. With its operator clustering and optimization-complexity matching, the algorithm can consistently find high-performance architectures across various search spaces. For all five variants of the popular cell-based search spaces, the proposed algorithm consistently obtains state-of-the-art architectures with the best accuracy on CIFAR-10, CIFAR-100, and ImageNet among well-established DARTS-like algorithms. Code is available at https://github.com/susan0199/StacNAS.
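To make the first finding concrete, the following is a minimal, hypothetical sketch (not taken from the paper or the linked repository) of how candidate operators could be grouped by the correlation of their feature maps before competing for architecture weights; the `cluster_operators` helper, the correlation threshold, and the specific operators are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: group candidate operators whose outputs are highly
# correlated, so that similar operators (e.g. convolutions of different
# kernel sizes) do not split the architecture weight among themselves.

def flat_outputs(ops, x):
    """Run each candidate operator on the same input and flatten its output."""
    return torch.stack([op(x).flatten() for op in ops])

def correlation_matrix(feats):
    """Pearson correlation between the flattened feature maps of each operator."""
    feats = feats - feats.mean(dim=1, keepdim=True)
    feats = feats / (feats.norm(dim=1, keepdim=True) + 1e-8)
    return feats @ feats.t()

def cluster_operators(ops, x, threshold=0.8):
    """Greedy single-pass clustering: an operator whose output correlates
    above `threshold` with a cluster representative joins that cluster;
    otherwise it starts a new cluster. Threshold is an assumed value."""
    corr = correlation_matrix(flat_outputs(ops, x))
    clusters, reps = [], []
    for i in range(len(ops)):
        for c, r in enumerate(reps):
            if corr[i, r] > threshold:
                clusters[c].append(i)
                break
        else:
            reps.append(i)
            clusters.append([i])
    return clusters

# Illustrative usage: two convolutions and one pooling operator on a
# shared input (whether they actually cluster together depends on the
# trained weights; with random weights this only demonstrates the API).
ops = [nn.Conv2d(8, 8, 3, padding=1),
       nn.Conv2d(8, 8, 5, padding=2),
       nn.AvgPool2d(3, stride=1, padding=1)]
x = torch.randn(1, 8, 16, 16)
print(cluster_operators(ops, x))
```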