Labelled "ground truth" datasets are routinely used to evaluate and audit AI algorithms deployed in high-stakes settings. However, there are no widely accepted benchmarks for the quality of labels in these datasets. We provide empirical evidence that label quality can significantly distort the results of algorithmic audits in real-world settings. Using data annotators of the kind typically hired by AI firms in India, we show that poor fidelity in ground-truth data can lead to spurious differences in the performance of automatic speech recognition (ASR) systems between urban and rural populations. After a rigorous, albeit expensive, label-cleaning process, these disparities between groups disappear. Our findings highlight how trade-offs between label quality and data-annotation costs can complicate algorithmic audits in practice. They also underscore the need to develop consensus-driven, widely accepted benchmarks for label quality.