In this work we demonstrate provable guarantees on the training of depth-2 neural networks in regimes not previously explored. (1) First, we give a simple stochastic algorithm that can train a ReLU gate in the realizable setting in linear time while using significantly milder conditions on the data distribution than previous results. Leveraging some additional distributional assumptions, we also show approximate recovery of the true label-generating parameters when training a ReLU gate while a probabilistic adversary is allowed to corrupt the true labels of the training data. Our guarantee on recovering the true weight degrades gracefully with increasing probability of attack and is nearly optimal in the worst case. Additionally, our analysis allows for mini-batching and computes how the convergence time scales with the mini-batch size. (2) Second, we exhibit a non-gradient iterative algorithm, "Neuro-Tron", which gives a first-of-its-kind poly-time approximate solution of a neural regression problem (here in the infinity-norm) at finite net widths and for non-realizable data.
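To make the realizable ReLU-gate setting in (1) concrete, the following is a minimal sketch of one stochastic, mini-batched update rule that recovers the true weight from clean labels. It is an illustration under stated assumptions, not necessarily the algorithm analyzed in this work: the function name `train_relu_gate`, the rule of updating only on samples with a positive label, the Gaussian data, and the step size are all hypothetical choices made for the example.

```python
import numpy as np

def train_relu_gate(X, y, lr=0.05, epochs=5, batch_size=1, seed=0):
    """Illustrative stochastic training of a single ReLU gate.

    Assumption (not taken from the abstract): on samples with y > 0 the
    realizable label equals w_star . x, so a linear-regression style step
    is applied on those samples only; other samples are skipped.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            xb, yb = X[idx], y[idx]
            active = yb > 0                    # samples where the gate is "on"
            resid = (yb - xb @ w) * active     # zero residual on inactive samples
            w += lr * (resid @ xb) / len(idx)  # averaged mini-batch step
    return w

# Synthetic realizable data: labels come from a ReLU gate with weight w_star.
rng = np.random.default_rng(1)
d, n = 5, 2000
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.maximum(0.0, X @ w_star)

w_hat = train_relu_gate(X, y, batch_size=1)
w_hat_batched = train_relu_gate(X, y, batch_size=8)
print(np.linalg.norm(w_hat - w_star), np.linalg.norm(w_hat_batched - w_star))
```

Each step here costs time linear in the dimension, and the `batch_size` argument gives a simple way to observe how the number of steps needed changes with the mini-batch size. The label-corruption setting described above is not modeled in this sketch.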