As AI continues to advance, it is important to understand how advanced systems will make choices and in what ways they may fail. Machines can already outsmart humans in some domains, so understanding how to safely build systems with capabilities at or above the human level is of particular concern. One might suspect that artificially generally intelligent (AGI) and artificially superintelligent (ASI) systems will be ones that humans cannot reliably outsmart. As a challenge to this assumption, this paper presents the Achilles Heel hypothesis, which states that even a potentially superintelligent system may nonetheless have stable decision-theoretic delusions that cause it to make irrational decisions in adversarial settings. In a survey of key dilemmas and paradoxes from the decision theory literature, a number of these potential Achilles Heels are discussed in the context of this hypothesis. Several novel contributions are made toward understanding the ways in which these weaknesses might be implanted into a system.