Neural Architecture Search (NAS) has fostered the automatic discovery of state-of-the-art neural architectures. Despite the progress achieved with NAS, little attention has so far been paid to its theoretical guarantees. In this work, we study the generalization properties of NAS under a unifying framework that enables (deep) layer skip connection search and activation function search. To this end, we derive lower (and upper) bounds on the minimum eigenvalue of the Neural Tangent Kernel (NTK) in the (in)finite-width regime for a search space that includes mixed activation functions, fully connected networks, and residual neural networks. We then use the minimum eigenvalue to establish generalization error bounds for NAS under stochastic gradient descent training. Importantly, we show both theoretically and experimentally how the derived results can guide NAS to select top-performing architectures, even without training, leading to a train-free algorithm based on our theory. Accordingly, our numerical validation sheds light on the design of computationally efficient methods for NAS. Our analysis is non-trivial due to the coupling of various architectures and activation functions under the unifying framework, and is of independent interest for deep learning theory in providing a lower bound on the minimum eigenvalue of the NTK.
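As a rough illustration of the train-free idea (not the paper's released algorithm), one could score candidate architectures at random initialization by the minimum eigenvalue of the empirical NTK on a small batch and rank them without any training. The sketch below does this in PyTorch; the `Candidate` search space (ReLU/Tanh activations with an optional skip connection), batch size, and layer widths are all hypothetical choices made only for illustration.

```python
# Minimal sketch: rank architectures by the minimum eigenvalue of the
# empirical NTK at initialization (train-free scoring). Illustrative only.
import torch
import torch.nn as nn

def empirical_ntk_min_eig(model, x):
    """Min eigenvalue of the empirical NTK Gram matrix J J^T on a batch x."""
    params = [p for p in model.parameters() if p.requires_grad]
    jac_rows = []
    for i in range(x.shape[0]):
        out = model(x[i:i + 1]).sum()        # scalar output for example i
        grads = torch.autograd.grad(out, params)
        jac_rows.append(torch.cat([g.reshape(-1) for g in grads]))
    J = torch.stack(jac_rows)                # (n, num_params) Jacobian
    ntk = J @ J.t()                          # (n, n) empirical NTK
    return torch.linalg.eigvalsh(ntk).min().item()

# Hypothetical search space: two-layer MLPs with either ReLU or Tanh
# activations and an optional residual (skip) branch.
class Candidate(nn.Module):
    def __init__(self, act, skip):
        super().__init__()
        self.fc1, self.fc2 = nn.Linear(16, 64), nn.Linear(64, 1)
        self.act, self.skip = act, skip
        self.proj = nn.Linear(16, 64) if skip else None

    def forward(self, x):
        h = self.act(self.fc1(x))
        if self.skip:
            h = h + self.proj(x)             # skip connection branch
        return self.fc2(h)

x = torch.randn(32, 16)                      # small probe batch
candidates = {f"{a.__class__.__name__}_skip={s}": Candidate(a, s)
              for a in (nn.ReLU(), nn.Tanh()) for s in (False, True)}
scores = {name: empirical_ntk_min_eig(m, x) for name, m in candidates.items()}
print(sorted(scores.items(), key=lambda kv: -kv[1]))  # larger min eig ranked first
```

In this sketch a larger minimum eigenvalue is treated as the better score, consistent with the abstract's claim that the minimum eigenvalue of the NTK controls the generalization error bound; the exact selection rule used in the paper may differ.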