Machine learning algorithms are often used in environments that are not accurately captured even by the most carefully obtained training data, either due to the possibility of `adversarial' test-time attacks or due to `natural' distribution shift. For test-time attacks, we introduce and analyze a novel robust reliability guarantee, which requires a learner to output predictions along with a reliability radius $\eta$, with the meaning that its prediction is guaranteed to be correct as long as the adversary has not perturbed the test point farther than a distance $\eta$. We provide learners that are optimal in the sense that they always output the best possible reliability radius on any test point, and we characterize the reliable region, i.e., the set of points where a given reliability radius is attainable. We additionally analyze reliable learners under distribution shift, where the test points may come from an arbitrary distribution $Q$ different from the training distribution $P$. For both cases, we bound the probability mass of the reliable region in several interesting settings: linear separators under nearly log-concave and $s$-concave distributions, as well as smooth-boundary classifiers under smooth probability distributions.
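To make the guarantee concrete, here is a minimal sketch (an illustration, not the paper's construction) for the simplest setting: a single known linear separator. For one fixed separator $h(x) = \mathrm{sign}(w \cdot x + b)$, the prediction cannot change under any $\ell_2$ perturbation smaller than the distance from $x$ to the decision hyperplane, so that distance is a valid reliability radius for this one classifier. The paper's optimal learners are stronger, certifying a radius with respect to every hypothesis consistent with the training data; the function and variable names below are assumptions chosen for illustration.

```python
import numpy as np

def predict_with_reliability(w, b, x):
    """Predict the label of x together with a reliability radius.

    For the single linear separator h(x) = sign(w . x + b), the
    prediction is unchanged under any l2 perturbation of x smaller
    than |w . x + b| / ||w||, the distance to the hyperplane, so
    that distance is a valid reliability radius eta for this h.
    (Sketch only: the paper's optimal radius accounts for all
    hypotheses consistent with the training data, not a single h.)
    """
    margin = np.dot(w, x) + b
    label = 1 if margin >= 0 else -1
    eta = abs(margin) / np.linalg.norm(w)  # distance to the decision boundary
    return label, eta

# Usage: the separator x1 + x2 = 1 in the plane.
w, b = np.array([1.0, 1.0]), -1.0
label, eta = predict_with_reliability(w, b, np.array([2.0, 2.0]))
print(label, eta)  # 1, 3/sqrt(2): correct for any perturbation of norm < eta
```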