To act safely and ethically in the real world, agents must be able to reason about harm and avoid harmful actions. In this paper we develop the first statistical definition of harm and a framework for factoring harm into algorithmic decisions. We argue that harm is fundamentally a counterfactual quantity, and show that standard machine learning algorithms are guaranteed to pursue harmful policies in certain environments. To resolve this, we derive a family of counterfactual objective functions that robustly mitigate harm. We demonstrate our approach with a statistical model for identifying optimal drug doses. Whereas identifying optimal doses using the causal treatment effect results in harmful treatment decisions, our counterfactual algorithm identifies doses that are far less harmful without sacrificing efficacy. Our results show that counterfactual reasoning is a key ingredient for safe and ethical AI.