Real-world data is laden with outlying values. The challenge for machine learning is that the learner typically has no prior knowledge of whether the feedback it receives (losses, gradients, etc.) will be heavy-tailed or not. In this work, we study a simple algorithmic strategy that can be leveraged when both losses and gradients can be heavy-tailed. The core technique introduces a simple robust validation sub-routine, which is used to boost the confidence of inexpensive gradient-based sub-processes. Compared with recent robust gradient descent methods from the literature, the dependence on dimension (in both risk bounds and computational cost) is substantially improved, without relying upon strong convexity or expensive per-step robustification. Empirically, we also show that under heavy-tailed losses, the proposed procedure cannot simply be replaced with naive cross-validation. Taken together, we obtain a scalable method with transparent guarantees, which performs well without prior knowledge of how "convenient" the feedback it receives will be.
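The core idea described above (cheap independent gradient-based sub-processes, followed by a robust validation step that selects among them) might be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the plain-SGD sub-process, the median-of-means validator, and all function names here are assumptions made for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def median_of_means(values, k=5):
    """Robust estimate of a mean: partition into k blocks, average each
    block, return the median of the block means (illustrative choice)."""
    blocks = np.array_split(rng.permutation(values), k)
    return float(np.median([b.mean() for b in blocks]))

def sgd(w, X, y, lr=0.01, epochs=3):
    """A cheap, non-robust sub-process: plain SGD on squared loss."""
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            w = w - lr * (X[i] @ w - y[i]) * X[i]
    return w

def boosted_sgd(X, y, n_candidates=5, val_frac=0.5):
    """Run independent SGD sub-processes on disjoint training splits,
    then keep the candidate with the smallest robust validation loss."""
    n = len(y)
    n_val = int(val_frac * n)
    Xv, yv = X[:n_val], y[:n_val]  # held-out validation set
    idx = np.array_split(np.arange(n_val, n), n_candidates)
    candidates = [sgd(np.zeros(X.shape[1]), X[s], y[s]) for s in idx]
    # Robust validation: median-of-means of the per-point squared losses.
    scores = [median_of_means((Xv @ w - yv) ** 2) for w in candidates]
    return candidates[int(np.argmin(scores))]

# Heavy-tailed demo: linear model with Student-t (2 d.o.f.) noise,
# whose variance is infinite.
d, n = 5, 2000
w_star = np.ones(d)
X = rng.normal(size=(n, d))
y = X @ w_star + rng.standard_t(df=2, size=n)
w_hat = boosted_sgd(X, y)
```

Even though any single SGD run can be thrown off by a few extreme noise draws, the robust validation step only needs one of the independent candidates to land near the optimum, which is how confidence is "boosted" without robustifying every gradient step.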