Neural Architecture Search (NAS) has fostered the automatic discovery of neural architectures that achieve state-of-the-art accuracy in image recognition. Despite this progress, little attention has so far been paid to theoretical guarantees for NAS. In this work, we study the generalization properties of NAS under a unifying framework that enables (deep) layer skip-connection search and activation function search. To this end, we derive lower (and upper) bounds on the minimum eigenvalue of the Neural Tangent Kernel in the (in)finite-width regime for a search space comprising mixed activation functions, fully connected networks, and residual neural networks. Our analysis is non-trivial due to the coupling of the various architectures and activation functions under the unifying framework. We then leverage the eigenvalue bounds to establish generalization error bounds for NAS under stochastic gradient descent training. Importantly, we show both theoretically and experimentally how the derived results can guide NAS to select top-performing architectures, even without training, leading to a training-free algorithm based on our theory. Accordingly, our numerical validation sheds light on the design of computationally efficient methods for NAS.
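To make the training-free selection concrete, the sketch below illustrates one plausible instantiation: scoring each candidate architecture at initialization by the minimum eigenvalue of its empirical NTK Gram matrix and ranking candidates by that score. This is only an assumption-laden illustration of the general idea, not the paper's exact algorithm; the function names `ntk_min_eigenvalue` and `rank_architectures` and the descending-eigenvalue ranking rule are hypothetical choices for this sketch.

```python
# Illustrative sketch (assumed, not the paper's exact procedure): rank candidate
# architectures by the minimum eigenvalue of the empirical NTK Gram matrix
# computed at initialization, so no training is required.
import torch


def ntk_min_eigenvalue(model, x):
    """Minimum eigenvalue of the empirical NTK Gram matrix K, where
    K[i, j] = <grad_theta f(x_i), grad_theta f(x_j)> at initialization."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = []
    for xi in x:
        out = model(xi.unsqueeze(0)).sum()
        g = torch.autograd.grad(out, params, allow_unused=True)
        flat = torch.cat([
            gi.reshape(-1) if gi is not None else torch.zeros(p.numel())
            for gi, p in zip(g, params)
        ])
        grads.append(flat)
    J = torch.stack(grads)      # (n, num_params) Jacobian of outputs w.r.t. parameters
    K = J @ J.t()               # empirical NTK Gram matrix on the batch
    return torch.linalg.eigvalsh(K).min().item()


def rank_architectures(candidates, x):
    """Return (score, model) pairs sorted by descending NTK minimum eigenvalue,
    under the (assumed) heuristic that a larger minimum eigenvalue is better."""
    scores = [(ntk_min_eigenvalue(m, x), m) for m in candidates]
    return sorted(scores, key=lambda s: -s[0])
```

In this sketch, a small held-out batch `x` suffices to score every candidate without any gradient-descent training, which is what makes such a selector computationally cheap compared with training each architecture.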