The measurement of bias in machine learning often focuses on model performance across identity subgroups (such as man and woman) with respect to ground-truth labels. However, these methods do not directly measure the associations that a model may have learned, for example between labels and identity subgroups. Further, measuring a model's bias requires a fully annotated evaluation dataset, which may not be easily available in practice. We present an elegant mathematical solution that tackles both issues simultaneously, using image classification as a working example. By treating a classification model's predictions for a given image as a set of labels analogous to a bag of words, we rank the biases that a model has learned with respect to different identity labels. We use (man, woman) as a concrete example of an identity label set (although this set need not be binary), and present rankings for the labels that are most biased towards one identity or the other. We demonstrate how the statistical properties of different association metrics can lead to different rankings of the most "gender biased" labels, and conclude that normalized pointwise mutual information (nPMI) is most useful in practice. Finally, we announce an open-source nPMI visualization tool built on TensorBoard.
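The ranking described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes each image's predictions arrive as a set of labels (the bag-of-words view), counts co-occurrences between identity labels and all other labels, and scores each pair with nPMI, defined as pmi(x, y) / (−log p(x, y)), which normalizes PMI into [−1, 1]. The function name and data layout are hypothetical.

```python
import math
from collections import Counter

def npmi_rankings(predictions, identity_labels):
    """Rank non-identity labels by nPMI with each identity label.

    predictions: list of per-image label sets (bag-of-words predictions).
    identity_labels: set of identity labels, e.g. {"man", "woman"}.
    Returns {identity: [(label, npmi), ...]} sorted by descending nPMI,
    so the most strongly associated ("biased") labels come first.
    """
    n = len(predictions)
    count = Counter()  # marginal counts per label
    co = Counter()     # joint counts for (identity, label) pairs
    for labels in predictions:
        for lab in labels:
            count[lab] += 1
        for ident in identity_labels & labels:
            for lab in labels - identity_labels:
                co[(ident, lab)] += 1

    rankings = {}
    for ident in identity_labels:
        scored = []
        for lab in count:
            if lab in identity_labels or (ident, lab) not in co:
                continue
            p_xy = co[(ident, lab)] / n
            p_x, p_y = count[ident] / n, count[lab] / n
            pmi = math.log(p_xy / (p_x * p_y))
            # Dividing by -log p(x, y) bounds the score in [-1, 1].
            scored.append((lab, pmi / -math.log(p_xy)))
        rankings[ident] = sorted(scored, key=lambda t: -t[1])
    return rankings
```

On a toy set of predictions where "tie" co-occurs only with "man", "dress" only with "woman", and "dog" with both equally, the ranking surfaces "tie" as the most man-associated label and scores "dog" near zero, matching the intuition that nPMI highlights labels whose co-occurrence with an identity exceeds chance.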