The standard paradigm in Neural Architecture Search (NAS) is to search for a fully deterministic architecture with specific operations and connections. In this work, we instead propose to search for the optimal operation distribution, thus providing a stochastic and approximate solution that can be used to sample architectures of arbitrary length. We propose, and show, that given an architectural cell, its performance largely depends on the ratio of the operations used rather than on any specific connection pattern in typical search spaces; that is, small changes in the ordering of the operations are often irrelevant. This intuition is orthogonal to any specific search strategy and can be applied to a diverse set of NAS algorithms. Through extensive validation on four datasets and four NAS techniques (Bayesian optimisation, differentiable search, local search and random search), we show that the operation distribution (1) holds enough discriminating power to reliably identify a solution and (2) is significantly easier to optimise than traditional encodings, leading to large speed-ups at little to no cost in performance. Indeed, this simple intuition significantly reduces the cost of current approaches and potentially enables NAS to be used in a broader range of applications.
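To make the core idea concrete, the minimal sketch below samples cells of arbitrary length from a single categorical operation distribution, rather than fixing specific operations and connections. The operation names, edge counts, and the helper `sample_cell` are illustrative assumptions for exposition, not the paper's implementation or search space.

```python
# Minimal sketch: sampling architectures from an operation distribution.
# Operation names and edge counts are hypothetical placeholders.
import random

OPS = ["conv3x3", "conv5x5", "max_pool", "skip_connect"]  # assumed operation set

def sample_cell(op_probs, num_edges):
    """Sample one cell: each edge independently draws an operation
    from the searched categorical distribution op_probs."""
    return [random.choices(OPS, weights=op_probs, k=1)[0] for _ in range(num_edges)]

# The same distribution can generate cells of arbitrary length; the search
# optimises op_probs instead of a fully deterministic architecture encoding.
op_probs = [0.4, 0.3, 0.2, 0.1]
small_cell = sample_cell(op_probs, num_edges=4)
large_cell = sample_cell(op_probs, num_edges=12)
print(small_cell)
print(large_cell)
```

Under the paper's claim that performance depends mainly on the ratio of operations used, two cells sampled this way should perform similarly even if their orderings differ, which is what makes the distribution itself a sufficient search target.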