Existing neural architecture search (NAS) methods often return an architecture that performs well during search but generalizes poorly to the test setting. To improve generalization, we propose a novel neighborhood-aware NAS formulation that identifies flat-minima architectures in the search space, under the assumption that flat minima generalize better than sharp minima. A ``flat-minima architecture'' is an architecture whose performance is stable under small perturbations of the architecture (e.g., replacing a convolution with a skip connection). Our formulation takes the ``flatness'' of an architecture into account by aggregating the performance over the neighborhood of that architecture. We demonstrate a principled way to apply our formulation to existing search algorithms, both sampling-based and gradient-based. To facilitate the application to gradient-based algorithms, we also propose a differentiable representation for the neighborhood of an architecture. Based on our formulation, we propose neighborhood-aware random search (NA-RS) and neighborhood-aware differentiable architecture search (NA-DARTS). Notably, by simply augmenting DARTS with our formulation, NA-DARTS outperforms DARTS and achieves state-of-the-art performance on established benchmarks, including CIFAR-10, CIFAR-100 and ImageNet.
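To make the neighborhood-aware idea concrete, the following is a minimal Python sketch of a sampling-based search that ranks each candidate by an aggregate of its own score and the scores of its neighbors. It is an illustration under stated assumptions, not the authors' implementation: the operation set `OPS`, the edge count `NUM_EDGES`, the one-edge-substitution definition of a neighborhood, the choice of the mean as the aggregation function, and the `toy_evaluate` scoring function are all hypothetical stand-ins.

```python
import random
from statistics import mean

# Hypothetical search space: each architecture is a tuple of one operation per edge.
OPS = ["conv3x3", "conv5x5", "skip", "maxpool"]
NUM_EDGES = 4


def neighbors(arch):
    """Return all architectures that differ from `arch` in exactly one edge.

    Assumption: a "small perturbation" is a single operation substitution.
    """
    result = []
    for i, op in enumerate(arch):
        for alt in OPS:
            if alt != op:
                result.append(arch[:i] + (alt,) + arch[i + 1:])
    return result


def neighborhood_score(arch, evaluate, aggregate=mean):
    """Aggregate the score of `arch` together with the scores of its neighbors.

    Assumption: the mean is used here; other aggregation functions could be swapped in.
    """
    scores = [evaluate(arch)] + [evaluate(n) for n in neighbors(arch)]
    return aggregate(scores)


def na_random_search(evaluate, num_samples, rng):
    """Random search that keeps the sample with the best neighborhood score."""
    best_arch, best_score = None, float("-inf")
    for _ in range(num_samples):
        arch = tuple(rng.choice(OPS) for _ in range(NUM_EDGES))
        score = neighborhood_score(arch, evaluate)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score


def toy_evaluate(arch):
    """Toy stand-in for validation accuracy: fraction of convolutional edges."""
    return sum(op.startswith("conv") for op in arch) / len(arch)


if __name__ == "__main__":
    arch, score = na_random_search(toy_evaluate, num_samples=50, rng=random.Random(0))
    print(arch, round(score, 4))
```

In this sketch an architecture whose neighbors also score well is preferred over one that scores well in isolation, which is the sense in which the search favors flat regions. In a real setting `evaluate` would be a validation measurement, so evaluating every neighbor multiplies the cost per sample; sampling a subset of the neighborhood is one possible way to limit that cost.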