Neural architecture search has attracted wide attention in both academia and industry. To accelerate it, researchers have proposed weight-sharing methods that first train a super-network to reuse computation among different operators, from which exponentially many sub-networks can be sampled and efficiently evaluated. These methods enjoy great advantages in terms of computational cost, but the sampled sub-networks are not guaranteed to be evaluated precisely unless an individual training process is performed on each of them. This paper attributes such inaccuracy to the inevitable mismatch between assembled network layers, which adds a random error term to each estimation. We alleviate this issue by training a graph convolutional network to fit the performance of sampled sub-networks, so that the impact of random errors becomes minimal. With this strategy, we achieve a higher rank correlation coefficient within the selected set of candidates, which consequently leads to better performance of the final architecture. In addition, our approach enjoys the flexibility of being applied under different hardware constraints, since the graph convolutional network provides an efficient lookup table for the performance of architectures over the entire search space.
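To make the core idea concrete, the following is a minimal sketch (in PyTorch, not the authors' released code) of a graph-convolutional performance predictor: each sampled sub-network is encoded as a graph with an adjacency matrix and one-hot operator features, and a small GCN regressor is fitted to (architecture, estimated accuracy) pairs. All module names, dimensions, and the synthetic training data below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, a_hat, h):
        # a_hat: (B, N, N) normalized adjacency; h: (B, N, in_dim)
        return torch.relu(a_hat @ self.lin(h))

class PerformancePredictor(nn.Module):
    """GCN encoder + mean-pool readout + linear head regressing accuracy."""
    def __init__(self, num_ops, hidden=64):
        super().__init__()
        self.gcn1 = GCNLayer(num_ops, hidden)
        self.gcn2 = GCNLayer(hidden, hidden)
        self.head = nn.Linear(hidden, 1)

    def forward(self, a_hat, x):
        h = self.gcn2(a_hat, self.gcn1(a_hat, x))
        return self.head(h.mean(dim=1)).squeeze(-1)

def normalize(adj):
    # Symmetric normalization A_hat = D^{-1/2} (A + I) D^{-1/2}
    a = adj + torch.eye(adj.size(-1))
    d = a.sum(-1).clamp(min=1e-6).pow(-0.5)
    return d.unsqueeze(-1) * a * d.unsqueeze(-2)

# Toy data standing in for sampled sub-networks and their super-network
# accuracies; in practice these come from the weight-sharing evaluation.
B, N, num_ops = 32, 7, 5
adj = (torch.rand(B, N, N) < 0.3).float().triu(1)       # random DAG edges
x = torch.eye(num_ops)[torch.randint(num_ops, (B, N))]  # one-hot operators
acc = torch.rand(B)                                     # stand-in accuracies

model = PerformancePredictor(num_ops)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(100):
    loss = nn.functional.mse_loss(model(normalize(adj), x), acc)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Once trained, such a predictor can score any architecture in the search space with a single forward pass, which is what makes it usable as a lookup table when filtering candidates under different hardware constraints.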