We study the parameterized complexity of training two-layer neural networks with respect to the dimension of the input data and the number of hidden neurons, considering ReLU and linear threshold activation functions. Although the computational complexity of these problems has been studied numerous times in recent years, several questions remain open. We answer questions by Arora et al. [ICLR '18] and Khalife and Basu [IPCO '22] by showing that both problems are NP-hard for two dimensions, which rules out any polynomial-time algorithm for constant dimension (unless P = NP). We also answer a question by Froese et al. [JAIR '22] by proving W[1]-hardness for four ReLUs (or two linear threshold neurons) with zero training error. Finally, in the ReLU case, we show fixed-parameter tractability for the combined parameter consisting of the input dimension and the number of ReLUs if the network is assumed to compute a convex map. Our results settle the complexity status with respect to these parameters almost completely.
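For concreteness, the underlying training problem can be formalized as follows; this is the standard empirical risk minimization formulation, and the notation here is ours rather than taken from the abstract. Given data points $(x_1, y_1), \dots, (x_n, y_n) \in \mathbb{R}^d \times \mathbb{R}$ and a number $k$ of hidden ReLU neurons, one seeks output weights $a_j \in \mathbb{R}$, input weights $w_j \in \mathbb{R}^d$, and biases $b_j \in \mathbb{R}$ minimizing the training error
\[
  \sum_{i=1}^{n} \Biggl( \sum_{j=1}^{k} a_j \max\bigl(0, \langle w_j, x_i \rangle + b_j \bigr) - y_i \Biggr)^2 .
\]
The linear threshold variant replaces $\max(0,\cdot)$ by the indicator of positivity, and zero training error corresponds to fitting all data points exactly.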