In this work we revisit two classic high-dimensional online learning problems, namely linear regression and contextual bandits, from the perspective of adversarial robustness. Existing works in algorithmic robust statistics make strong distributional assumptions ensuring that the input data is evenly spread out or comes from a nice generative model. Is it possible to achieve strong robustness guarantees without any distributional assumptions at all, when the sequence of tasks we are asked to solve is chosen adaptively and adversarially? We answer this question in the affirmative for both linear regression and contextual bandits. In fact, our algorithms succeed where conventional methods fail. In particular, we show strong lower bounds against Huber regression and, more generally, any convex M-estimator. Our approach is based on a novel alternating minimization scheme that interleaves ordinary least squares with a simple convex program that finds the optimal reweighting of the distribution under a spectral constraint. Our results achieve essentially optimal dependence on the contamination level $\eta$, reach the optimal breakdown point, and naturally apply to infinite-dimensional settings where the feature vectors are represented implicitly via a kernel map.
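To make the alternating scheme concrete, one natural reading of the two interleaved steps is sketched below. This is only an illustrative sketch on our part: the abstract does not specify the exact constraint set, constants, or update rule, so the spectral bound $c$ and the precise form of the reweighting program are assumptions for exposition. Given samples $(x_1, y_1), \dots, (x_n, y_n)$, weights $w \in [0,1]^n$, and contamination level $\eta$, one alternates
\begin{align*}
\theta &\leftarrow \arg\min_{\theta} \sum_{i=1}^{n} w_i \bigl(y_i - \langle x_i, \theta\rangle\bigr)^2
&& \text{(weighted ordinary least squares)}\\
w &\leftarrow \arg\min_{w \in [0,1]^n} \sum_{i=1}^{n} w_i \bigl(y_i - \langle x_i, \theta\rangle\bigr)^2
\quad \text{s.t.} \quad \sum_{i=1}^{n} w_i \ge (1-\eta)\,n,
\quad \sum_{i=1}^{n} w_i\, x_i x_i^{\top} \preceq c \sum_{i=1}^{n} x_i x_i^{\top}
&& \text{(spectrally constrained reweighting)}
\end{align*}
where $c \ge 1$ is an assumed absolute constant. The second step has a linear objective in $w$ with linear and positive-semidefinite constraints, so it is a simple convex program, matching the description above.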