Understanding the computational complexity of training simple neural networks with rectified linear units (ReLUs) has recently been a subject of intensive research. Closing gaps and complementing results from the literature, we present several results on the parameterized complexity of training two-layer ReLU networks with respect to various loss functions. After a brief discussion of other parameters, we focus on analyzing the influence of the dimension $d$ of the training data on the computational complexity. We provide running time lower bounds in terms of W[1]-hardness for parameter $d$ and prove that known brute-force strategies are essentially optimal (assuming the Exponential Time Hypothesis). In comparison with previous work, our results hold for a broad(er) range of loss functions, including $\ell^p$-loss for all $p\in[0,\infty]$. In particular, we extend a known polynomial-time algorithm for constant $d$ and convex loss functions to a more general class of loss functions, thereby matching our running time lower bounds in these cases as well.
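To make the setting concrete, a standard formalization of the training problem is sketched below for finite $p>0$; the number $k$ of hidden ReLU units, the bias terms $b_j$, and the scalar output weights $a_j$ are assumptions of this sketch and are not fixed by the abstract itself.
\[
  \min_{a,\,W,\,b}\;\sum_{i=1}^{n} \Bigl|\, y_i - \sum_{j=1}^{k} a_j \,\max\bigl(0,\; w_j^{\top} x_i + b_j\bigr) \Bigr|^{p},
\]
where $(x_1, y_1), \dots, (x_n, y_n) \in \mathbb{R}^d \times \mathbb{R}$ are the training data, $w_j \in \mathbb{R}^d$ and $b_j \in \mathbb{R}$ are the weights and bias of the $j$-th hidden ReLU, and $a_j \in \mathbb{R}$ is its output weight; the data dimension $d$ is the parameter studied in the complexity analysis.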