Fairness-Aware Data Valuation for Supervised Learning - 专知 (Zhuanzhi) Papers


Fairness · Metrics · Supervised learning · Pre-processing

March 29, 2023

Fairness-Aware Data Valuation for Supervised Learning


José Pombal, Pedro Saleiro, Mário A. T. Figueiredo, Pedro Bizarro

From arXiv, ICLR 2023 Workshop Trustworthy ML

Data valuation is an ML field that studies the value of training instances towards a given predictive task. Although data bias is one of the main sources of downstream model unfairness, previous work in data valuation does not consider how training instances may influence both the performance and the fairness of ML models. Thus, we propose Fairness-Aware Data valuatiOn (FADO), a data valuation framework that can be used to incorporate fairness concerns into a series of ML-related tasks (e.g., data pre-processing, exploratory data analysis, active learning). We propose an entropy-based data valuation metric suited to our two-pronged goal of maximizing both performance and fairness, which is more computationally efficient than existing metrics. We then show how FADO can be applied as the basis for unfairness mitigation pre-processing techniques. Our methods achieve promising results (up to a 40 p.p. improvement in fairness at a less than 1 p.p. loss in performance compared to a baseline) and promote fairness in a data-centric way, where a deeper understanding of data quality takes center stage.
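The abstract does not define the entropy-based metric, so the following is only a toy sketch of what scoring instances by prediction entropy could look like, and should not be read as FADO's actual formula. It assumes each training instance comes with two predicted probabilities (one for the target label, one for membership in a protected group) and mixes their Shannon entropies with a weight. The names `binary_entropy`, `toy_value`, `p_label`, `p_group`, and `alpha` are all hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a Bernoulli(p) prediction."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully certain prediction carries no entropy
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def toy_value(p_label: float, p_group: float, alpha: float = 0.5) -> float:
    """Hypothetical instance score: a weighted mix of two entropies.

    p_label: predicted probability of the positive label (assumed input).
    p_group: predicted probability that the instance belongs to a given
             protected group (assumed input).
    alpha:   assumed trade-off weight between the two terms.
    """
    return alpha * binary_entropy(p_label) + (1.0 - alpha) * binary_entropy(p_group)


# Three illustrative instances, from most to least uncertain on both heads.
scores = [toy_value(0.5, 0.5), toy_value(0.9, 0.5), toy_value(0.99, 0.99)]
```

Under these assumptions, an instance scores highest when both predictions are maximally uncertain. Whether high or low entropy on the group prediction should count as valuable is a design choice this sketch does not settle. The paper itself gives the actual definition.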

