One of the key steps in Neural Architecture Search (NAS) is estimating the performance of candidate architectures. Existing methods either use the validation performance directly or learn a predictor to estimate it. However, these methods can be either computationally expensive or very inaccurate, which may severely affect both search efficiency and search performance. Moreover, because it is very difficult to annotate architectures with accurate performance on specific tasks, learning a reliable performance predictor is often non-trivial due to the lack of labeled data. In this paper, we argue that it may not be necessary to estimate absolute performance for NAS. Instead, it may suffice to determine whether an architecture is better than a baseline one. However, how to exploit this comparison information as the reward and how to make good use of the limited labeled data remain two major challenges. To address them, we propose a novel Contrastive Neural Architecture Search (CTNAS) method, which performs architecture search by taking the comparison results between architectures as the reward. Specifically, we design and learn a Neural Architecture Comparator (NAC) to compute the probability that a candidate architecture is better than a baseline one. Moreover, we present a baseline updating scheme that improves the baseline iteratively in a curriculum learning manner. More critically, we theoretically show that learning NAC is equivalent to optimizing the ranking over architectures. Extensive experiments in three search spaces demonstrate the superiority of CTNAS over existing methods.
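The comparison-based reward and the baseline updating scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how NAC is built, so `nac_probability` is a hypothetical stand-in that applies a logistic function to the difference of two toy scores. The names `update_baseline`, `search`, `sample_fn`, and `score_fn` are likewise assumptions, and the rule used to pick a new baseline (highest mean win probability within the pool) is one plausible choice rather than the paper's.

```python
import math
import random


def nac_probability(score_candidate, score_baseline):
    """Hypothetical stand-in for NAC: P(candidate is better than baseline).

    A learned comparator would take the two architectures as input; here a
    logistic function of a toy score difference is assumed instead.
    """
    return 1.0 / (1.0 + math.exp(-(score_candidate - score_baseline)))


def update_baseline(baseline, candidates, score_fn):
    """Assumed update rule: keep the pool member that most often 'wins'."""
    pool = list(candidates) + [baseline]

    def mean_win(i):
        # Average probability that pool[i] beats every other pool member.
        probs = [
            nac_probability(score_fn(pool[i]), score_fn(pool[j]))
            for j in range(len(pool))
            if j != i
        ]
        return sum(probs) / len(probs)

    best_index = max(range(len(pool)), key=mean_win)
    return pool[best_index]


def search(sample_fn, score_fn, baseline, iterations=5, batch=8, seed=0):
    """Toy search loop: the reward is the comparison result, not accuracy."""
    rng = random.Random(seed)
    reward_history = []
    for _ in range(iterations):
        candidates = [sample_fn(rng) for _ in range(batch)]
        rewards = [
            nac_probability(score_fn(c), score_fn(baseline)) for c in candidates
        ]
        # A real controller would be updated with these rewards; omitted here.
        reward_history.append(rewards)
        # Raising the baseline makes later comparisons progressively harder.
        baseline = update_baseline(baseline, candidates, score_fn)
    return baseline, reward_history
```

As a usage example, architectures can be represented as tuples of integers with `sum` as the toy score, e.g. `search(lambda rng: tuple(rng.randint(0, 3) for _ in range(3)), sum, (0, 0, 0))`. Because the reward depends only on the comparison with the current baseline, no absolute performance estimate is needed in the loop.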