Understanding the fundamental limits of robust supervised learning has emerged as a problem of immense interest, from both practical and theoretical standpoints. In particular, it is critical to determine classifier-agnostic bounds on the training loss to establish when learning is possible. In this paper, we determine optimal lower bounds on the cross-entropy loss in the presence of test-time adversaries, along with the corresponding optimal classification outputs. Our formulation of the bound as the solution to an optimization problem is general enough to encompass any loss function depending on soft classifier outputs. We also propose, and provide a proof of correctness for, a bespoke algorithm to compute this lower bound efficiently, allowing us to determine lower bounds for multiple practical datasets of interest. We use our lower bounds as a diagnostic tool to assess the effectiveness of current robust training methods and find a gap from optimality at larger adversarial budgets. Finally, we investigate the possibility of using the optimal classification outputs as soft labels to empirically improve robust training.
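To make the optimization-based formulation concrete, below is a minimal sketch of a convex program of this kind for a binary toy problem. It is illustrative only and is not the paper's bespoke algorithm: the function name `cross_entropy_lower_bound`, the L2 budget `eps`, and the pairwise conflict rule (two differently labeled points conflict when their eps-balls intersect, so no classifier can be confidently correct on both) are assumptions made for this example; the paper's formulation is more general.

```python
# A minimal, hypothetical sketch of the kind of convex program the abstract
# describes (NOT the paper's algorithm). Assumptions: examples live in R^d,
# the adversary has an L2 budget eps, and two differently labeled points
# "conflict" when their eps-balls intersect. For each conflicting pair (i, j),
# we impose q_i + q_j <= 1, where q_i is the probability the classifier
# assigns to example i's true label. The lower bound on cross-entropy loss is
# then the minimum average -log q_i subject to these constraints.
import numpy as np
import cvxpy as cp

def cross_entropy_lower_bound(X, y, eps):
    n = len(y)
    q = cp.Variable(n)                      # probability on each true label
    constraints = [q >= 1e-9, q <= 1]       # keep log well-defined
    for i in range(n):
        for j in range(i + 1, n):
            if y[i] != y[j] and np.linalg.norm(X[i] - X[j]) <= 2 * eps:
                constraints.append(q[i] + q[j] <= 1)   # conflicting pair
    objective = cp.Minimize(cp.sum(-cp.log(q)) / n)    # average cross-entropy
    prob = cp.Problem(objective, constraints)
    prob.solve()
    return prob.value, q.value

# Toy usage: two well-separated points plus one conflicting pair.
X = np.array([[0.0], [2.0], [0.9], [1.1]])
y = np.array([0, 1, 0, 1])
bound, q_opt = cross_entropy_lower_bound(X, y, eps=0.15)
print(bound, q_opt)   # the conflicting pair is pushed to q = 0.5 each
```

In this toy instance, the unconflicted points can be classified with certainty (zero loss), while the conflicting pair each receive probability 0.5, contributing log 2 apiece; no classifier, however expressive, can do better against this adversary.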