We consider the algorithmic problem of finding the optimal weights and biases for a two-layer fully connected neural network to fit a given set of data points. This problem is known as empirical risk minimization in the machine learning community. We show that the problem is $\exists\mathbb{R}$-complete. This complexity class can be defined as the set of algorithmic problems that are polynomial-time equivalent to finding real roots of a polynomial with integer coefficients. Furthermore, we show that arbitrary algebraic numbers are required as weights to be able to train some instances to optimality, even if all data points are rational. Our results hold even if the following restrictions are all added simultaneously.
$\bullet$ There are exactly two output neurons.
$\bullet$ There are exactly two input neurons.
$\bullet$ The data has only 13 different labels.
$\bullet$ The number of hidden neurons is a constant fraction of the number of data points.
$\bullet$ The target training error is zero.
$\bullet$ The ReLU activation function is used.
This shows that even very simple networks are difficult to train. The result explains why typical methods for $\mathsf{NP}$-complete problems, like mixed-integer programming or SAT-solving, cannot train neural networks to global optimality, unless $\mathsf{NP}=\exists\mathbb{R}$. We strengthen a recent result by Abrahamsen, Kleist and Miltzow [NeurIPS 2021].
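To make the problem statement concrete, the following is one standard way to formalize the training (empirical risk minimization) decision problem for the architecture described above; the precise loss function and encoding used in the paper may differ, and the squared loss below is an illustrative assumption (any loss that vanishes exactly on perfect fits yields the same zero-error question).

Given data points $(x_1, y_1), \dots, (x_n, y_n) \in \mathbb{Q}^2 \times \mathbb{Q}^2$ and a number $m$ of hidden neurons, decide whether there exist weights and biases $W^{(1)} \in \mathbb{R}^{m \times 2}$, $b^{(1)} \in \mathbb{R}^{m}$, $W^{(2)} \in \mathbb{R}^{2 \times m}$, $b^{(2)} \in \mathbb{R}^{2}$ such that the two-layer ReLU network
$$
f(x) \;=\; W^{(2)} \, \sigma\!\left(W^{(1)} x + b^{(1)}\right) + b^{(2)},
\qquad \sigma(z) = \max(z, 0) \text{ applied componentwise},
$$
attains training error zero, i.e.,
$$
\sum_{i=1}^{n} \bigl\lVert f(x_i) - y_i \bigr\rVert^2 \;=\; 0
\quad\Longleftrightarrow\quad
f(x_i) = y_i \ \text{ for all } i \in \{1, \dots, n\}.
$$
The restrictions listed above then say that this decision problem remains $\exists\mathbb{R}$-complete even with two inputs, two outputs, only 13 distinct labels among the $y_i$, and $m$ a constant fraction of $n$.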