复杂:RAI数据集与算法公平基准之间的混乱关系 (It's COMPASlicated: The Messy Relationship between RAI Datasets and Algorithmic Fairness Benchmarks) - 专知论文

会员服务 ·

0

Facebook AI Research · 数据集 · Performer · CASES · Prompt ·

2021 年 6 月 10 日

It's COMPASlicated: The Messy Relationship between RAI Datasets and Algorithmic Fairness Benchmarks

翻译：复杂:RAI数据集与算法公平基准之间的混乱关系

Michelle Bao,Angela Zhou,Samantha Zottola,Brian Brubach,Sarah Desmarais,Aaron Horowitz,Kristian Lum,Suresh Venkatasubramanian

Risk assessment instrument (RAI) datasets, particularly ProPublica's COMPAS dataset, are commonly used in algorithmic fairness papers due to benchmarking practices of comparing algorithms on datasets used in prior work. In many cases, this data is used as a benchmark to demonstrate good performance without accounting for the complexities of criminal justice (CJ) processes. We show that pretrial RAI datasets contain numerous measurement biases and errors inherent to CJ pretrial evidence and due to disparities in discretion and deployment, are limited in making claims about real-world outcomes, making the datasets a poor fit for benchmarking under assumptions of ground truth and real-world impact. Conventional practices of simply replicating previous data experiments may implicitly inherit or edify normative positions without explicitly interrogating assumptions. With context of how interdisciplinary fields have engaged in CJ research, algorithmic fairness practices are misaligned for meaningful contribution in the context of CJ, and would benefit from transparent engagement with normative considerations and values related to fairness, justice, and equality. These factors prompt questions about whether benchmarks for intrinsically socio-technical systems like the CJ system can exist in a beneficial and ethical way.

翻译：风险评估工具(RAI)数据集,特别是ProPublica的COMPAS数据集,通常用于算法公平性文件,这是因为比较以往工作中使用的数据集算法的基准做法,在许多情况下,这些数据被用作一种基准,以证明良好业绩,而不考虑刑事司法程序的复杂性。我们表明,审前风险评估工具数据集包含许多衡量偏见和误差,特别是ProPublica的COMPAS数据集,在对真实世界结果进行索赔时受到限制,使数据集在地面事实和真实世界影响的假设下不适合基准设定。简单复制以往数据试验的常规做法可能隐含地继承或调整规范立场,而没有明确询问假设。在跨学科领域如何参与CJ研究的背景下,算法公平做法在CJ工作中作出有意义的贡献方面有误,而且如果透明地参与与公平、正义和平等有关的规范性考虑和价值观,则会受益。这些因素引发了这样一个问题,即像CJ系统这样的内在社会技术系统是否能够以有益和合乎道德的方式存在基准。

0

相关内容

Facebook AI Research

Facebook AI Research

Facebook AI Research

【Cell】神经算法推理，Neural algorithmic reasoning

【Cell】神经算法推理，Neural algorithmic reasoning

专知会员服务

29+阅读 · 2021年7月16日

可信机器学习的公平性综述

可信机器学习的公平性综述

专知会员服务

69+阅读 · 2021年2月23日

【干货书】强化学习算法，98页pdf综合讲解人工智能和机器学习

【干货书】强化学习算法，98页pdf综合讲解人工智能和机器学习

专知会员服务

66+阅读 · 2021年2月21日

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

Yoshua Bengio，使算法知道“为什么”

Yoshua Bengio，使算法知道“为什么”

专知会员服务

8+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

已删除

将门创投

5+阅读 · 2019年10月29日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

【TED】生命中的每一年的智慧

【TED】生命中的每一年的智慧

英语演讲视频每日一推

10+阅读 · 2019年1月29日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

Call4Papers

5+阅读 · 2018年12月7日

计算机 | CCF推荐会议信息10条

计算机 | CCF推荐会议信息10条

Call4Papers

5+阅读 · 2018年10月18日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【推荐】Kaggle机器学习数据集推荐

【推荐】Kaggle机器学习数据集推荐

机器学习研究会

8+阅读 · 2017年11月19日

Benchmarking machine learning models on multi-centre eICU critical care dataset

Benchmarking machine learning models on multi-centre eICU critical care dataset

Arxiv

0+阅读 · 2021年8月5日

Terabyte-scale supervised 3D training and benchmarking dataset of the mouse kidney

Arxiv

0+阅读 · 2021年8月4日

Computationally Efficient Optimization of Plackett-Luce Ranking Models for Relevance and Fairness

Arxiv

5+阅读 · 2021年5月3日

Federated Learning with Fair Averaging

Arxiv

7+阅读 · 2021年4月30日

Maximizing Marginal Fairness for Dynamic Learning to Rank

Arxiv

7+阅读 · 2021年2月18日

The Importance of Modeling Data Missingness in Algorithmic Fairness: A Causal Perspective

Arxiv

5+阅读 · 2020年12月21日

A Benchmarking Study of Embedding-based Entity Alignment for Knowledge Graphs

Arxiv

3+阅读 · 2020年7月20日

Mining Disinformation and Fake News: Concepts, Methods, and Recent Advancements

Mining Disinformation and Fake News: Concepts, Methods, and Recent Advancements

Arxiv

16+阅读 · 2020年1月2日

Equity of Attention: Amortizing Individual Fairness in Rankings

Arxiv

4+阅读 · 2018年5月4日

TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild

Arxiv

7+阅读 · 2018年3月28日

VIP会员

文章信息

相关主题

Facebook AI Research

相关VIP内容

【Cell】神经算法推理，Neural algorithmic reasoning

【Cell】神经算法推理，Neural algorithmic reasoning

专知会员服务

29+阅读 · 2021年7月16日

可信机器学习的公平性综述

可信机器学习的公平性综述

专知会员服务

69+阅读 · 2021年2月23日

【干货书】强化学习算法，98页pdf综合讲解人工智能和机器学习

【干货书】强化学习算法，98页pdf综合讲解人工智能和机器学习

专知会员服务

66+阅读 · 2021年2月21日

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

Yoshua Bengio，使算法知道“为什么”

Yoshua Bengio，使算法知道“为什么”

专知会员服务

8+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

NeurIPS 2025 | 自动化所新作速览（一）

大型语言模型（LLM）赋能的知识图谱构建：综述

NeurIPS 2025 | 自动化所新作速览（二）

领域特定文本分类中的预训练语言模型新进展：系统综述

相关资讯

已删除

将门创投

5+阅读 · 2019年10月29日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

【TED】生命中的每一年的智慧

【TED】生命中的每一年的智慧

英语演讲视频每日一推

10+阅读 · 2019年1月29日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

Call4Papers

5+阅读 · 2018年12月7日

计算机 | CCF推荐会议信息10条

计算机 | CCF推荐会议信息10条

Call4Papers

5+阅读 · 2018年10月18日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【推荐】Kaggle机器学习数据集推荐

【推荐】Kaggle机器学习数据集推荐

机器学习研究会

8+阅读 · 2017年11月19日

相关论文

Benchmarking machine learning models on multi-centre eICU critical care dataset

Benchmarking machine learning models on multi-centre eICU critical care dataset

Arxiv

0+阅读 · 2021年8月5日

Terabyte-scale supervised 3D training and benchmarking dataset of the mouse kidney

Arxiv

0+阅读 · 2021年8月4日

Computationally Efficient Optimization of Plackett-Luce Ranking Models for Relevance and Fairness

Arxiv

5+阅读 · 2021年5月3日

Federated Learning with Fair Averaging

Arxiv

7+阅读 · 2021年4月30日

Maximizing Marginal Fairness for Dynamic Learning to Rank

Arxiv

7+阅读 · 2021年2月18日

The Importance of Modeling Data Missingness in Algorithmic Fairness: A Causal Perspective

Arxiv

5+阅读 · 2020年12月21日

A Benchmarking Study of Embedding-based Entity Alignment for Knowledge Graphs

Arxiv

3+阅读 · 2020年7月20日

Mining Disinformation and Fake News: Concepts, Methods, and Recent Advancements

Mining Disinformation and Fake News: Concepts, Methods, and Recent Advancements

Arxiv

16+阅读 · 2020年1月2日

Equity of Attention: Amortizing Individual Fairness in Rankings

Arxiv

4+阅读 · 2018年5月4日

TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild

Arxiv

7+阅读 · 2018年3月28日

微信扫码咨询专知VIP会员