Recent theoretical works on over-parameterized neural nets have focused on two aspects: optimization and generalization. Many existing works that study optimization and generalization together are based on the neural tangent kernel and require a very large width. In this work, we are interested in the following question: for a binary classification problem with a two-layer, mildly over-parameterized ReLU network, can we find a point with small test error in polynomial time? We first show that the landscape of the loss function with explicit regularization has the following property: all local minima, as well as certain other points that are stationary only along certain directions, achieve small test error. We then prove that for convolutional neural nets, there is an algorithm that finds one of these points in time polynomial in the input dimension and the number of data points. In addition, we prove that for fully connected neural nets, a polynomial-time algorithm exists under an additional assumption on the data distribution.
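To make the setup concrete, below is a minimal NumPy sketch of the objective the abstract refers to: a two-layer ReLU network for binary classification, trained on an empirical loss with an explicit regularizer. The abstract does not specify the exact loss or regularizer, so the logistic loss and the squared-norm penalty with coefficient `lam` here are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def two_layer_relu(x, W, v):
    """Two-layer ReLU network: f(x) = v^T ReLU(W x).
    W has shape (k, d) for k hidden units; v has shape (k,)."""
    return v @ np.maximum(W @ x, 0.0)

def regularized_loss(W, v, X, y, lam):
    """Empirical logistic loss on labels y in {-1, +1}, plus an
    explicit regularizer. The l2 penalty is an assumed form; the
    paper's actual regularizer may differ."""
    margins = np.array([yi * two_layer_relu(xi, W, v)
                        for xi, yi in zip(X, y)])
    data_loss = np.mean(np.log1p(np.exp(-margins)))
    reg = lam * (np.sum(W ** 2) + np.sum(v ** 2))
    return data_loss + reg

# Toy usage: d-dimensional inputs, k hidden units ("mildly"
# over-parameterized means k need not be very large), n samples.
rng = np.random.default_rng(0)
d, k, n = 10, 20, 100
X = rng.standard_normal((n, d))
y = rng.choice([-1.0, 1.0], size=n)
W = 0.1 * rng.standard_normal((k, d))
v = 0.1 * rng.standard_normal(k)
print(regularized_loss(W, v, X, y, lam=1e-3))
```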