Understanding when neural networks can be learned efficiently is a fundamental question in learning theory. Existing hardness results suggest that assumptions on both the input distribution and the network's weights are necessary for obtaining efficient algorithms. Moreover, it was previously shown that depth-$2$ networks can be efficiently learned under the assumptions that the input distribution is Gaussian and the weight matrix is non-degenerate. In this work, we study whether such assumptions may suffice for learning deeper networks, and we prove negative results. We show that learning depth-$3$ ReLU networks under the Gaussian input distribution is hard even in the smoothed-analysis framework, where random noise is added to the network's parameters. This implies that learning depth-$3$ ReLU networks under the Gaussian distribution is hard even if the weight matrices are non-degenerate. Moreover, we consider depth-$2$ networks and show hardness of learning in the smoothed-analysis framework, where both the network parameters and the input distribution are smoothed. Our hardness results hold under a well-studied assumption on the existence of local pseudorandom generators.
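For concreteness, the following is a minimal sketch (not from the paper) of the setting described above: a depth-$3$ ReLU network whose parameters are smoothed by small random perturbations, evaluated on inputs drawn from the standard Gaussian distribution. The layer widths and noise scale here are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

d, h1, h2 = 10, 8, 8   # input dimension and hidden widths (assumed for illustration)
sigma = 0.01           # smoothing noise magnitude (assumed for illustration)

# A depth-3 ReLU network: x -> w3 . relu(W2 @ relu(W1 @ x)).
W1 = rng.standard_normal((h1, d))
W2 = rng.standard_normal((h2, h1))
w3 = rng.standard_normal(h2)

# Smoothed-analysis perturbation: add small Gaussian noise to every parameter.
W1s = W1 + sigma * rng.standard_normal(W1.shape)
W2s = W2 + sigma * rng.standard_normal(W2.shape)
w3s = w3 + sigma * rng.standard_normal(w3.shape)

relu = lambda z: np.maximum(z, 0.0)

def depth3_relu(x, A, B, c):
    """Evaluate the depth-3 ReLU network on a single input x."""
    return c @ relu(B @ relu(A @ x))

# Gaussian input distribution: the learner receives labeled examples (x, f(x)).
X = rng.standard_normal((1000, d))
y = np.array([depth3_relu(x, W1s, W2s, w3s) for x in X])
```

The hardness results concern learning from such examples: even with the smoothing perturbation and Gaussian inputs, no efficient algorithm can recover (or approximate) the target network under the stated assumption.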