反事实伤害 (Counterfactual harm) - 专知论文

会员服务 ·

0

可辨认的 · 统计量 · Performer · 目标函数 · Less ·

2022 年 5 月 17 日

Counterfactual harm

翻译：反事实伤害

Jonathan G. Richens,Rory Beard,Daniel H. Thompson

from arxiv, Changes to definition 3. Typos corrected and document shortened. Updated Appendices A - C

To act safely and ethically in the real world, agents must be able to reason about harm and avoid harmful actions. In this paper we develop the first statistical definition of harm and a framework for incorporating harm into algorithmic decisions. We argue that harm is fundamentally a counterfactual quantity, and show that standard machine learning algorithms that cannot perform counterfactual reasoning are guaranteed to pursue harmful policies in certain environments. To resolve this we derive a family of counterfactual objective functions that robustly mitigate for harm. We demonstrate our approach with a statistical model for identifying optimal drug doses. While standard algorithms that select doses using causal treatment effects result in significant harm, our counterfactual algorithm identifies doses that are significantly less harmful without sacrificing efficacy. Our results show that counterfactual reasoning is a key ingredient for safe and ethical AI.

翻译：为了在现实世界中安全和合乎道德地采取行动,代理人必须能够理性地理解伤害并避免有害行动。在本文件中,我们制定了第一个伤害统计定义和将伤害纳入算法决定的框架。我们争辩说,伤害从根本上说是一个反事实的数量,并表明不能进行反事实推理的标准机器学习算法可以保证在某些环境中推行有害政策。要解决这个问题,我们形成一个反事实目标功能大家庭,以强有力地减轻伤害。我们用统计模型来证明我们的方法,以确定最佳药物剂量。虽然使用因果治疗效果选择剂量的标准算法造成了重大伤害,但我们的反事实算法却确定了在不牺牲效力的情况下危害性要小得多的剂量。我们的结果表明,反事实推理是安全和道德的AI的关键要素。

1

相关内容

可辨认的

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【因果基础】Causality Basics，36页ppt

专知会员服务

52+阅读 · 2021年8月8日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

250+阅读 · 2020年4月19日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Call for Nominations: 2022 Multimedia Prize Paper Award

Call for Nominations: 2022 Multimedia Prize Paper Award

CCF多媒体专委会

0+阅读 · 2022年2月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

肾透明细胞癌中SETD2基因突变介导DNA错配修复系统功能受损的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

非均质量子器件Schr？dinger-Poisson系统多尺度分析与算法研究

国家自然科学基金

0+阅读 · 2014年12月31日

Brd2调控巨噬细胞新的死亡方式—pyroptosis在动脉粥样硬化中的作用和机制

国家自然科学基金

0+阅读 · 2013年12月31日

基于纵向双RESURF概念的沟槽型SOI横向功率器件新结构及模型研究

国家自然科学基金

0+阅读 · 2013年12月31日

Perp在类风湿性关节炎外周Th17细胞存活中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

膜蛋白Leptosphaeria rhodopsin二聚化组装的结构机制研究

国家自然科学基金

1+阅读 · 2012年12月31日

RI与Angiogenin相互作用调控PI3K/AKT/mTOR信号通路和ANG的核转位在膀胱癌发生发展中的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

COUP-TFII负调节动脉粥样硬化发生的作用和机制

国家自然科学基金

0+阅读 · 2012年12月31日

降低多视点编码复杂度的优化模型与方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

TR3相互作用新蛋白机理研究

国家自然科学基金

1+阅读 · 2008年12月31日

Calibrate to Interpret

Calibrate to Interpret

Arxiv

0+阅读 · 2022年7月7日

Towards Counterfactual Image Manipulation via CLIP

Arxiv

0+阅读 · 2022年7月7日

Robust Counterfactual Explanations for Tree-Based Ensembles

Robust Counterfactual Explanations for Tree-Based Ensembles

Arxiv

0+阅读 · 2022年7月6日

De-Biasing Generative Models using Counterfactual Methods

Arxiv

0+阅读 · 2022年7月5日

Parallel and distributed Bayesian modelling for analysing high-dimensional spatio-temporal count data

Arxiv

0+阅读 · 2022年7月5日

Learning and Evaluating Graph Neural Network Explanations based on Counterfactual and Factual Reasoning

Arxiv

17+阅读 · 2022年2月17日

Multi-Agent Simulation for AI Behaviour Discovery in Operations Research

Arxiv

39+阅读 · 2021年8月30日

Counterfactual Zero-Shot and Open-Set Visual Recognition

Arxiv

12+阅读 · 2021年3月1日

Counterfactual VQA: A Cause-Effect Look at Language Bias

Arxiv

16+阅读 · 2020年12月28日

Counterfactual Explanations for Machine Learning: A Review

Arxiv

25+阅读 · 2020年10月20日

VIP会员

文章信息

相关主题

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【因果基础】Causality Basics，36页ppt

专知会员服务

52+阅读 · 2021年8月8日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

250+阅读 · 2020年4月19日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《战区安全决策课程体系》最新244页

《"无人机航母"原型平台》

任务规划与地形分析：现代复杂环境作战导航体系

《攻击场景描述形式化模型研究》

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Call for Nominations: 2022 Multimedia Prize Paper Award

Call for Nominations: 2022 Multimedia Prize Paper Award

CCF多媒体专委会

0+阅读 · 2022年2月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

相关论文

Calibrate to Interpret

Calibrate to Interpret

Arxiv

0+阅读 · 2022年7月7日

Towards Counterfactual Image Manipulation via CLIP

Arxiv

0+阅读 · 2022年7月7日

Robust Counterfactual Explanations for Tree-Based Ensembles

Robust Counterfactual Explanations for Tree-Based Ensembles

Arxiv

0+阅读 · 2022年7月6日

De-Biasing Generative Models using Counterfactual Methods

Arxiv

0+阅读 · 2022年7月5日

Parallel and distributed Bayesian modelling for analysing high-dimensional spatio-temporal count data

Arxiv

0+阅读 · 2022年7月5日

Learning and Evaluating Graph Neural Network Explanations based on Counterfactual and Factual Reasoning

Arxiv

17+阅读 · 2022年2月17日

Multi-Agent Simulation for AI Behaviour Discovery in Operations Research

Arxiv

39+阅读 · 2021年8月30日

Counterfactual Zero-Shot and Open-Set Visual Recognition

Arxiv

12+阅读 · 2021年3月1日

Counterfactual VQA: A Cause-Effect Look at Language Bias

Arxiv

16+阅读 · 2020年12月28日

Counterfactual Explanations for Machine Learning: A Review

Arxiv

25+阅读 · 2020年10月20日

相关基金

肾透明细胞癌中SETD2基因突变介导DNA错配修复系统功能受损的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

非均质量子器件Schr？dinger-Poisson系统多尺度分析与算法研究

国家自然科学基金

0+阅读 · 2014年12月31日

Brd2调控巨噬细胞新的死亡方式—pyroptosis在动脉粥样硬化中的作用和机制

国家自然科学基金

0+阅读 · 2013年12月31日

基于纵向双RESURF概念的沟槽型SOI横向功率器件新结构及模型研究

国家自然科学基金

0+阅读 · 2013年12月31日

Perp在类风湿性关节炎外周Th17细胞存活中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

膜蛋白Leptosphaeria rhodopsin二聚化组装的结构机制研究

国家自然科学基金

1+阅读 · 2012年12月31日

RI与Angiogenin相互作用调控PI3K/AKT/mTOR信号通路和ANG的核转位在膀胱癌发生发展中的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

COUP-TFII负调节动脉粥样硬化发生的作用和机制

国家自然科学基金

0+阅读 · 2012年12月31日

降低多视点编码复杂度的优化模型与方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

TR3相互作用新蛋白机理研究

国家自然科学基金

1+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员