On Tilted Losses in Machine Learning: Theory and Applications

Exponential tilting is a technique commonly used in fields such as statistics, probability, information theory, and optimization to create parametric distribution shifts. Despite its prevalence in related fields, tilting has not seen widespread use in machine learning. In this work, we aim to bridge this gap by exploring the use of tilting in risk minimization. We study a simple extension to empirical risk minimization (ERM) -- tilted empirical risk minimization (TERM) -- which uses exponential tilting to flexibly tune the impact of individual losses. The resulting framework has several useful properties: we show that TERM can increase or decrease the influence of outliers to enable fairness or robustness, respectively; has variance-reduction properties that can benefit generalization; and can be viewed as a smooth approximation to the tail probability of losses. Our work makes rigorous connections between TERM and related objectives, such as Value-at-Risk, Conditional Value-at-Risk, and distributionally robust optimization (DRO). We develop batch and stochastic first-order optimization methods for solving TERM, provide convergence guarantees for the solvers, and show that the framework can be solved efficiently relative to common alternatives. Finally, we demonstrate that TERM can be used for a multitude of applications in machine learning, such as enforcing fairness between subgroups, mitigating the effect of outliers, and handling class imbalance. Despite the straightforward modification TERM makes to traditional ERM objectives, we find that the framework can consistently outperform ERM and deliver competitive performance with state-of-the-art, problem-specific approaches.
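The abstract does not spell out the objective, so the following is only a minimal sketch of how exponential tilting can reweight individual losses. It assumes the usual log-mean-exp form, (1/t) · log(mean(exp(t · loss_i))), with a tilt parameter t. The function names `tilted_loss` and `tilted_weights` are hypothetical and chosen for illustration; they are not taken from the authors' code.

```python
import numpy as np

def tilted_loss(losses, t):
    """Exponentially tilted aggregate of per-sample losses (assumed form).

    Computes (1/t) * log(mean(exp(t * losses))).
    As t -> 0 this reduces to the ordinary average used by ERM.
    """
    losses = np.asarray(losses, dtype=float)
    if abs(t) < 1e-12:
        return float(losses.mean())
    z = t * losses
    m = z.max()  # shift by the max for numerical stability (log-sum-exp trick)
    return float((m + np.log(np.mean(np.exp(z - m)))) / t)

def tilted_weights(losses, t):
    """Normalized per-sample weights implied by the tilted objective's gradient."""
    z = t * np.asarray(losses, dtype=float)
    w = np.exp(z - z.max())
    return w / w.sum()

# Three typical losses and one outlier.
losses = [0.1, 0.2, 0.3, 5.0]
for t in (-2.0, 0.0, 2.0):
    print(t, tilted_loss(losses, t), tilted_weights(losses, t))
```

Under this assumed form, positive t shifts weight toward the largest losses (moving the aggregate toward the worst case, which relates to the fairness use described above), while negative t shifts weight toward the smallest losses and so suppresses the outlier (the robustness use).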
