Removing biased data to improve fairness and accuracy - Zhuanzhi Papers (专知论文)


February 5, 2021

Removing biased data to improve fairness and accuracy


Sahil Verma, Michael Ernst, Rene Just

From arXiv: 16 pages, 5 figures, 8 tables

Machine learning systems are often trained using data collected from historical decisions. If past decisions were biased, then automated systems that learn from historical data will also be biased. We propose a black-box approach to identify and remove biased training data. Machine learning models trained on such debiased data (a subset of the original training data) have low individual discrimination, often 0%. These models also have greater accuracy and lower statistical disparity than models trained on the full historical data. We evaluated our methodology in experiments using 6 real-world datasets. Our approach outperformed seven previous approaches in terms of individual discrimination and accuracy.
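The two fairness measures named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a binary sensitive attribute encoded as 0/1, treats individual discrimination as the fraction of inputs whose prediction changes when only that attribute is flipped, and treats statistical disparity as the gap in positive-prediction rates between the two groups. The function names, the toy models, and the data are all hypothetical.

```python
from typing import Callable, Sequence

Row = Sequence[int]  # one individual, features encoded as integers


def individual_discrimination(
    predict: Callable[[Row], int],
    rows: Sequence[Row],
    sensitive_index: int,
) -> float:
    """Fraction of rows whose prediction changes when only the
    binary (0/1) sensitive attribute is flipped."""
    if not rows:
        return 0.0
    changed = 0
    for row in rows:
        flipped = list(row)
        flipped[sensitive_index] = 1 - flipped[sensitive_index]
        if predict(row) != predict(flipped):
            changed += 1
    return changed / len(rows)


def statistical_disparity(
    predict: Callable[[Row], int],
    rows: Sequence[Row],
    sensitive_index: int,
) -> float:
    """Absolute gap in positive-prediction rate between the two groups."""
    rates = []
    for group in (0, 1):
        members = [r for r in rows if r[sensitive_index] == group]
        if not members:
            return 0.0  # assumption: report no gap if a group is empty
        rates.append(sum(predict(r) for r in members) / len(members))
    return abs(rates[0] - rates[1])


# Hypothetical toy models: feature 0 is the sensitive attribute,
# feature 1 is a score that should drive the decision.
def biased_model(row: Row) -> int:
    return int(row[1] >= 5 and row[0] == 1)


def fair_model(row: Row) -> int:
    return int(row[1] >= 5)


DATA = [(0, 3), (0, 7), (1, 4), (1, 8)]

if __name__ == "__main__":
    for name, model in (("biased", biased_model), ("fair", fair_model)):
        print(
            name,
            individual_discrimination(model, DATA, 0),
            statistical_disparity(model, DATA, 0),
        )
```

Both measures treat the model as a black box: they only need to call `predict`, which matches the black-box framing in the abstract, although the paper's procedure for selecting which training rows to remove is not reproduced here.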


