Tilted Empirical Risk Minimization - 专知 Papers


Empirical risk minimization · Empirical risk · Extensibility · Facebook AI Research · Outliers

March 17, 2021

Tilted Empirical Risk Minimization


Tian Li, Ahmad Beirami, Maziar Sanjabi, Virginia Smith

From arXiv; accepted at ICLR 2021

Empirical risk minimization (ERM) is typically designed to perform well on the average loss, which can result in estimators that are sensitive to outliers, generalize poorly, or treat subgroups unfairly. While many methods aim to address these problems individually, in this work, we explore them through a unified framework -- tilted empirical risk minimization (TERM). In particular, we show that it is possible to flexibly tune the impact of individual losses through a straightforward extension to ERM using a hyperparameter called the tilt. We provide several interpretations of the resulting framework: We show that TERM can increase or decrease the influence of outliers, respectively, to enable fairness or robustness; has variance-reduction properties that can benefit generalization; and can be viewed as a smooth approximation to a superquantile method. We develop batch and stochastic first-order optimization methods for solving TERM, and show that the problem can be efficiently solved relative to common alternatives. Finally, we demonstrate that TERM can be used for a multitude of applications, such as enforcing fairness between subgroups, mitigating the effect of outliers, and handling class imbalance. TERM is not only competitive with existing solutions tailored to these individual problems, but can also enable entirely new applications, such as simultaneously addressing outliers and promoting fairness.
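To make the role of the tilt concrete, here is a minimal sketch. It assumes the commonly stated form of the tilted objective, (1/t) · log((1/N) · Σ exp(t · f_i)), where f_i are the individual losses and t is the tilt; this page does not spell the formula out, so treat the form as an assumption. The function names `tilted_risk` and `tilted_weights` and the toy loss values are hypothetical and chosen only for illustration.

```python
import numpy as np

def tilted_risk(losses, t):
    """Assumed t-tilted risk: (1/t) * log(mean(exp(t * losses)))."""
    losses = np.asarray(losses, dtype=float)
    if t == 0.0:
        # The t -> 0 limit is the ordinary average loss used by ERM.
        return float(losses.mean())
    z = t * losses
    m = z.max()  # shift by the max for a numerically stable log-mean-exp
    return float((m + np.log(np.mean(np.exp(z - m)))) / t)

def tilted_weights(losses, t):
    """Per-sample weights proportional to exp(t * loss); they sum to 1."""
    losses = np.asarray(losses, dtype=float)
    z = t * losses
    w = np.exp(z - z.max())
    return w / w.sum()

# Toy losses: three typical samples and one outlier (illustrative values).
losses = np.array([0.1, 0.2, 0.3, 5.0])
for t in (-2.0, 0.0, 2.0):
    print(t, tilted_risk(losses, t), tilted_weights(losses, t))
```

Under this assumed form, a positive tilt pushes the objective toward the largest loss and puts more weight on the outlier, while a negative tilt pulls it below the average and suppresses the outlier. That matches the abstract's description of increasing or decreasing the influence of outliers.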



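Related content

Empirical risk minimization

Empirical risk minimization (ERM) is a principle in statistical learning theory that defines a family of learning algorithms and is used to derive theoretical bounds on their performance. Under the ERM strategy, the model with the smallest empirical risk is taken to be the best model, so finding the best model amounts to solving an optimization problem.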
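As a small illustration of ERM as an optimization problem, the sketch below fits a linear model by minimizing the average squared loss on a training set. The squared loss, the linear model, the synthetic data, and the name `empirical_risk` are all assumptions made for the example; none of them come from this page.

```python
import numpy as np

def empirical_risk(theta, X, y):
    """Average squared loss of the linear model X @ theta on the sample."""
    residuals = X @ theta - y
    return float(np.mean(residuals ** 2))

# Synthetic training data, generated only for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
true_theta = np.array([1.5, -0.5])
y = X @ true_theta + 0.1 * rng.normal(size=50)

# For squared loss, the empirical risk minimizer is the least-squares solution.
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(theta_hat, empirical_risk(theta_hat, X, y))
```

With other losses or model classes there is usually no closed-form solution, and the same minimization is carried out with an iterative optimizer instead.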
