Neural Architecture Search (NAS) has attracted increasing attention in recent years. Among NAS methods, differentiable approaches such as DARTS have gained popularity for their search efficiency. However, they still suffer from three main issues: weak stability due to performance collapse, poor generalization ability of the searched architectures, and inferior robustness to different kinds of proxies. To solve the stability and generalization problems, a simple but effective regularization method, termed Beta-Decay, is proposed to regularize the DARTS-based NAS search process (i.e., $\beta$-DARTS). Specifically, Beta-Decay regularization imposes constraints that keep the value and variance of the activated architecture parameters from becoming too large, thereby ensuring fair competition among architecture parameters and making the supernet less sensitive to the impact of the input on the operation set. In-depth theoretical analyses of how and why it works are provided. Comprehensive experiments validate that Beta-Decay regularization helps stabilize the search process and makes the searched network more transferable across different datasets. To address the robustness problem, we first benchmark different NAS methods under a wide range of proxy data, proxy channels, proxy layers, and proxy epochs, since the robustness of NAS under different kinds of proxies has not been explored before. We then draw several findings from this benchmark, notably that $\beta$-DARTS achieves the best result among all compared NAS methods under almost all proxies. We further introduce a novel flooding regularization into the weight optimization of $\beta$-DARTS (i.e., Bi-level regularization), and verify experimentally and theoretically its effectiveness in improving the proxy robustness of differentiable NAS.
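
The two regularizers described above can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption: the Beta-Decay penalty is taken to be a log-sum-exp over the raw architecture parameters of each edge (which grows when any parameter becomes large relative to the others), and flooding is written in its common form $|L - b| + b$, which keeps the training loss from falling below a flood level $b$. The function names (`beta_decay_penalty`, `flooded_loss`, `search_objective`) and the parameters `reg_weight` and `flood_level` are hypothetical and chosen only for illustration.

```python
import math


def softmax(alphas):
    """Activated architecture parameters (beta) for one edge."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def beta_decay_penalty(alphas):
    """Assumed penalty: numerically stable log-sum-exp of the raw parameters.

    It increases when any single parameter grows, which discourages one
    operation from dominating the softmax early in the search.
    """
    m = max(alphas)
    return m + math.log(sum(math.exp(a - m) for a in alphas))


def flooded_loss(loss, flood_level):
    """Flooding: the loss is reflected around flood_level, so it never drops below it."""
    return abs(loss - flood_level) + flood_level


def search_objective(val_loss, alphas_per_edge, reg_weight):
    """Architecture-level objective: validation loss plus the weighted penalty.

    alphas_per_edge is a list of per-edge parameter lists (hypothetical layout).
    """
    penalty = sum(beta_decay_penalty(a) for a in alphas_per_edge)
    return val_loss + reg_weight * penalty


if __name__ == "__main__":
    edges = [[0.0, 0.0, 0.0], [2.0, 0.0, -1.0]]
    print("beta per edge:", [softmax(a) for a in edges])
    print("architecture objective:", search_objective(1.2, edges, reg_weight=0.5))
    print("flooded weight loss:", flooded_loss(0.3, flood_level=0.5))
```

In this sketch the penalty is applied to the architecture-parameter update and flooding to the network-weight update, mirroring the two levels named in "Bi-level regularization"; how the original method schedules `reg_weight` or chooses `flood_level` is not stated in the abstract.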