Towards Improving the Consistency, Efficiency, and Flexibility of Differentiable Neural Architecture Search

Most differentiable neural architecture search methods construct a super-net for search and derive a target-net as its sub-graph for evaluation. A significant gap exists between the architectures used in search and in evaluation, so current methods suffer from an inconsistent, inefficient, and inflexible search process. In this paper, we introduce EnTranNAS, which is composed of Engine-cells and Transit-cells. The Engine-cell is differentiable for architecture search, while the Transit-cell only transits a sub-graph obtained by architecture derivation. Consequently, the gap between the architectures in search and evaluation is significantly reduced. Our method also saves considerable memory and computation, which speeds up the search process. A feature sharing strategy is introduced for more balanced optimization and more efficient search. Furthermore, we develop a new architecture derivation method to replace the traditional one, which is based on a hand-crafted rule. Our method enables differentiable sparsification and keeps the derived architecture equivalent to that of the Engine-cell, which further improves the consistency between search and evaluation. It also supports topology search, in which a node can be connected to prior nodes with any number of connections, so the searched architectures can be more flexible. On CIFAR-10, our search on the standard space requires only 0.06 GPU-day. We further achieve an error rate of 2.22% with 0.07 GPU-day for the search on an extended space. We can also perform the search directly on ImageNet with learnable topology and achieve a top-1 error rate of 23.8% in 2.1 GPU-days.
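The contrast between the two cell types can be illustrated with a toy example. This is a minimal sketch, not the EnTranNAS implementation: it assumes the common softmax relaxation over candidate operations for the search-time edge and a plain argmax for derivation, whereas the abstract describes a differentiable sparsification method that is not reproduced here. The operation set and the names `engine_edge`, `transit_edge`, and `derive_op` are hypothetical.

```python
import math

# Hypothetical candidate operations on one edge. These scalar functions are
# toy stand-ins for real operations such as convolutions or pooling.
CANDIDATE_OPS = {
    "skip": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def engine_edge(x, alpha):
    """Search-time edge: a weighted mixture of every candidate operation.

    Assumption: softmax weighting, as in common differentiable NAS methods.
    """
    weights = softmax(alpha)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))


def derive_op(alpha):
    """Pick the operation with the largest architecture parameter.

    Assumption: argmax stands in for the paper's derivation method.
    """
    names = list(CANDIDATE_OPS)
    best = max(range(len(alpha)), key=lambda i: alpha[i])
    return names[best]


def transit_edge(x, alpha):
    """Transit-style edge: evaluate only the single derived operation."""
    return CANDIDATE_OPS[derive_op(alpha)](x)


alpha = [0.1, 1.5, -0.3]  # illustrative architecture parameters
x = 3.0
mixed = engine_edge(x, alpha)     # evaluates all three operations
derived = transit_edge(x, alpha)  # evaluates one operation only
```

In this toy picture the transit-style edge evaluates one operation where the engine-style edge evaluates three, which hints at where the memory and computation savings described in the abstract could come from.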
