Though learning has become a core component of modern information processing, there is now ample evidence that it can lead to biased, unsafe, and prejudiced systems. The need to impose requirements on learning is therefore paramount, especially as it reaches critical applications in social, industrial, and medical domains. However, the non-convexity of most modern statistical problems is only exacerbated by the introduction of constraints. Whereas good unconstrained solutions can often be learned using empirical risk minimization, even obtaining a model that satisfies statistical constraints can be challenging, let alone a good one. In this paper, we overcome this issue by learning in the empirical dual domain, where constrained statistical learning problems become unconstrained and deterministic. We analyze the generalization properties of this approach by bounding the empirical duality gap -- i.e., the difference between our approximate, tractable solution and the solution of the original (non-convex) statistical problem -- and provide a practical constrained learning algorithm. These results establish a constrained counterpart to classical learning theory, enabling the explicit use of constraints in learning. We illustrate this theory and algorithm in rate-constrained learning applications arising in fairness and adversarial robustness.
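
The idea of learning in the empirical dual domain can be illustrated with a generic primal-dual loop. The sketch below is not the algorithm from the paper; it is a minimal example under assumptions made here for illustration: a linear logistic model, a single hypothetical constraint requiring the average loss on a subgroup (`mask`) to stay below a threshold `c`, and single gradient steps in place of a full minimization of the empirical Lagrangian. The names `primal_dual`, `mask`, `c`, `eta_theta`, and `eta_lam` are all hypothetical.

```python
import numpy as np


def logistic_loss_and_grad(theta, X, y):
    """Mean logistic loss and its gradient for labels y in {0, 1}."""
    z = X @ theta
    # log(1 + exp(z)) - y * z, computed stably with logaddexp
    loss = np.mean(np.logaddexp(0.0, z) - y * z)
    p = 1.0 / (1.0 + np.exp(-z))
    grad = X.T @ (p - y) / len(y)
    return loss, grad


def primal_dual(X, y, mask, c, steps=500, eta_theta=0.5, eta_lam=0.1):
    """Illustrative primal-dual iteration on the empirical Lagrangian

        L(theta, lam) = loss(all data) + lam * (loss(subgroup) - c),

    with gradient descent on theta and projected gradient ascent on lam.
    The constraint and step sizes are assumptions of this sketch.
    """
    theta = np.zeros(X.shape[1])
    lam = 0.0
    for _ in range(steps):
        _, g_obj = logistic_loss_and_grad(theta, X, y)
        loss_con, g_con = logistic_loss_and_grad(theta, X[mask], y[mask])
        # Primal step: descend the Lagrangian in theta for the current lam.
        theta = theta - eta_theta * (g_obj + lam * g_con)
        # Dual step: ascend in lam along the constraint slack, keep lam >= 0.
        lam = max(0.0, lam + eta_lam * (loss_con - c))
    return theta, lam


# Synthetic data, used only to exercise the sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(float)
mask = X[:, 1] > 0  # hypothetical subgroup
theta, lam = primal_dual(X, y, mask, c=0.5)
```

The dual variable `lam` grows while the subgroup loss exceeds `c` and shrinks toward zero once the constraint has slack, so the constrained problem is handled through a sequence of unconstrained updates.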