In this work we revisit two classic high-dimensional online learning problems, namely regression and linear contextual bandits, from the perspective of adversarial robustness. Existing works in algorithmic robust statistics make strong distributional assumptions that ensure that the input data is evenly spread out or comes from a nice generative model. Is it possible to achieve strong robustness guarantees without any distributional assumptions at all, when the sequence of tasks we are asked to solve is adaptively and adversarially chosen? We answer this question in the affirmative for both regression and linear contextual bandits. In fact, our algorithms succeed where convex surrogates fail, in the sense that we show strong lower bounds for the existing approaches. Our approach is based on a novel alternating minimization scheme that interleaves ordinary least squares with a semidefinite program for finding appropriate reweightings of the distribution. Moreover, we give extensions of our main results to infinite-dimensional settings where the feature vectors are represented implicitly via a kernel map.
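To illustrate the flavor of the alternating scheme described above, here is a minimal toy sketch in NumPy: it alternates a weighted least-squares step with a reweighting step. Note the assumptions: the paper's reweighting is found via a semidefinite program, whereas this sketch substitutes a much cruder residual-based trimming rule, and the function name, `trim_frac` parameter, and iteration count are all hypothetical choices for illustration only.

```python
import numpy as np

def alternating_robust_regression(X, y, n_iters=10, trim_frac=0.2):
    """Toy alternating-minimization sketch (NOT the paper's algorithm):
    (i) solve a weighted least-squares problem, then
    (ii) reweight by zeroing out the points with the largest residuals.
    The paper finds the reweighting via an SDP; hard trimming is a
    crude stand-in used here purely for illustration."""
    n, d = X.shape
    w = np.ones(n)              # current weights on the data points
    k = int(trim_frac * n)      # number of points to downweight per round
    beta = np.zeros(d)
    for _ in range(n_iters):
        # (i) weighted least squares: solve (X^T W X) beta = X^T W y,
        # with a small ridge term for numerical stability
        Xw = X * w[:, None]
        beta = np.linalg.solve(Xw.T @ X + 1e-8 * np.eye(d), Xw.T @ y)
        # (ii) reweight: unit weight everywhere except the k largest residuals
        resid = np.abs(X @ beta - y)
        w = np.ones(n)
        if k > 0:
            w[np.argsort(resid)[-k:]] = 0.0
    return beta
```

On data where a small fraction of responses is grossly corrupted, the first least-squares fit leaves large residuals on the corrupted points, so the reweighting step screens them out and later iterations fit the clean data.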