对元加强学习的抽样攻击:微量制式和复杂程度分析 (Sampling Attacks on Meta Reinforcement Learning: A Minimax Formulation and Complexity Analysis) - 专知论文

会员服务 ·

0

Minimax · Learning · Agent · Analysis · 样本 ·

2022 年 7 月 29 日

Sampling Attacks on Meta Reinforcement Learning: A Minimax Formulation and Complexity Analysis

翻译：对元加强学习的抽样攻击:微量制式和复杂程度分析

Tao Li,Haozhe Lei,Quanyan Zhu

Meta reinforcement learning (meta RL), as a combination of meta-learning ideas and reinforcement learning (RL), enables the agent to adapt to different tasks using a few samples. However, this sampling-based adaptation also makes meta RL vulnerable to adversarial attacks. By manipulating the reward feedback from sampling processes in meta RL, an attacker can mislead the agent into building wrong knowledge from training experience, which deteriorates the agent's performance when dealing with different tasks after adaptation. This paper provides a game-theoretical underpinning for understanding this type of security risk. In particular, we formally define the sampling attack model as a Stackelberg game between the attacker and the agent, which yields a minimax formulation. It leads to two online attack schemes: Intermittent Attack and Persistent Attack, which enable the attacker to learn an optimal sampling attack, defined by an $\epsilon$-first-order stationary point, within $\mathcal{O}(\epsilon^{-2})$ iterations. These attack schemes freeride the learning progress concurrently without extra interactions with the environment. By corroborating the convergence results with numerical experiments, we observe that a minor effort of the attacker can significantly deteriorate the learning performance, and the minimax approach can also help robustify the meta RL algorithms.

翻译：元强化学习(meta RL)是元学习理念和强化学习(RL)的结合,它使该代理商能够适应使用少数样本的不同任务。然而,这种基于抽样的适应性调整也使得元RL易受对抗性攻击。通过操纵元RL中取样过程的奖励反馈,攻击者可以误导该代理商从培训经验中积累错误的知识,这在适应后处理不同任务时使该代理商的性能恶化。本文件为了解这种安全风险提供了一种游戏理论基础。特别是,我们正式将抽样攻击模式定义为攻击者与代理商之间的斯塔克尔贝格游戏,产生迷你式的配方。它导致两个在线攻击计划:Interpitt攻击和持久性攻击,使袭击者能够从培训经验中学习最佳的抽样攻击,而培训者在处理不同任务时,在$mathcal{O} (\epsilon ⁇ -2} (\\\\\\\\ 2} ) 里拉特。这些攻击计划可以将学习的进展与环境同时自由,而没有额外的相互作用。它产生微量的配方配方配方的配方的组合。它能能够使微的实验,我们观察到微的实验,通过大大的实验,可以观测微级的累合性的努力。

1

相关内容

Minimax

33页PPT【AI+天气预测】，AI and Machine learning for weather predictions

33页PPT【AI+天气预测】，AI and Machine learning for weather predictions

专知会员服务

35+阅读 · 2022年3月5日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【新书发布】原作者MarcG.Bellemare发布315页分布强化学习书籍(DistributionalRL)

【新书发布】原作者MarcG.Bellemare发布315页分布强化学习书籍(DistributionalRL)

深度强化学习实验室

1+阅读 · 2022年1月11日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Lnc-TRMT2A竞争性结合miR-520a调控炎性通路在精神分裂症发病中的作用研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于自主学习的Ad hoc Agent序贯决策研究

国家自然科学基金

46+阅读 · 2015年12月31日

保险中两类随机最优控制问题及策略过程概率分布研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于行为的SoS体系结构评价研究

国家自然科学基金

1+阅读 · 2012年12月31日

高阶格式在非定常流动中的若干基础问题研究及在动态特性问题中的应用

国家自然科学基金

0+阅读 · 2011年12月31日

Multi-Agent架构智能机器人推理机实时性研究

国家自然科学基金

1+阅读 · 2011年12月31日

专利h指数与专利信息网络测度研究

国家自然科学基金

1+阅读 · 2011年12月31日

受干扰信号的自适应滤波及信号检测

国家自然科学基金

0+阅读 · 2011年12月31日

miR-124和miR-27对阿尔茨海默病BACE1基因影响的分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

右侧颞顶联合区在注意瞬脱中的门控作用：fMRI、ERP和TMS研究

国家自然科学基金

0+阅读 · 2009年12月31日

Chromatic and spatial analysis of one-pixel attacks against an image classifier

Arxiv

0+阅读 · 2022年9月28日

Stacking Ensemble Learning in Deep Domain Adaptation for Ophthalmic Image Classification

Arxiv

0+阅读 · 2022年9月27日

Enhanced Meta Reinforcement Learning using Demonstrations in Sparse Reward Environments

Arxiv

0+阅读 · 2022年9月26日

Delayed Geometric Discounts: An Alternative Criterion for Reinforcement Learning

Arxiv

0+阅读 · 2022年9月26日

Robust Reinforcement Learning as a Stackelberg Game via Adaptively-Regularized Adversarial Training

Arxiv

0+阅读 · 2022年9月24日

MixTailor: Mixed Gradient Aggregation for Robust Learning Against Tailored Attacks

Arxiv

0+阅读 · 2022年9月23日

Analysis of Reinforcement Learning for determining task replication in workflows

Arxiv

0+阅读 · 2022年9月14日

A Survey on Reinforcement Learning for Recommender Systems

Arxiv

22+阅读 · 2021年9月22日

MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration

Arxiv

12+阅读 · 2021年2月7日

A Multi-Objective Deep Reinforcement Learning Framework

A Multi-Objective Deep Reinforcement Learning Framework

Arxiv

16+阅读 · 2018年6月27日

VIP会员

文章信息

相关主题

相关VIP内容

33页PPT【AI+天气预测】，AI and Machine learning for weather predictions

33页PPT【AI+天气预测】，AI and Machine learning for weather predictions

专知会员服务

35+阅读 · 2022年3月5日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《在单一作战合成环境（SSE）中运用人工智能与大型语言模型以提供灵活人文地形及可信角色组》报告

《俄罗斯的未来战争方式第二部分：核威慑》报告

《提示战争：大语言模型如何决定军事干预》报告

《俄罗斯的未来战争方式第三部分：军事改革》报告

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【新书发布】原作者MarcG.Bellemare发布315页分布强化学习书籍(DistributionalRL)

【新书发布】原作者MarcG.Bellemare发布315页分布强化学习书籍(DistributionalRL)

深度强化学习实验室

1+阅读 · 2022年1月11日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Chromatic and spatial analysis of one-pixel attacks against an image classifier

Arxiv

0+阅读 · 2022年9月28日

Stacking Ensemble Learning in Deep Domain Adaptation for Ophthalmic Image Classification

Arxiv

0+阅读 · 2022年9月27日

Enhanced Meta Reinforcement Learning using Demonstrations in Sparse Reward Environments

Arxiv

0+阅读 · 2022年9月26日

Delayed Geometric Discounts: An Alternative Criterion for Reinforcement Learning

Arxiv

0+阅读 · 2022年9月26日

Robust Reinforcement Learning as a Stackelberg Game via Adaptively-Regularized Adversarial Training

Arxiv

0+阅读 · 2022年9月24日

MixTailor: Mixed Gradient Aggregation for Robust Learning Against Tailored Attacks

Arxiv

0+阅读 · 2022年9月23日

Analysis of Reinforcement Learning for determining task replication in workflows

Arxiv

0+阅读 · 2022年9月14日

A Survey on Reinforcement Learning for Recommender Systems

Arxiv

22+阅读 · 2021年9月22日

MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration

Arxiv

12+阅读 · 2021年2月7日

A Multi-Objective Deep Reinforcement Learning Framework

A Multi-Objective Deep Reinforcement Learning Framework

Arxiv

16+阅读 · 2018年6月27日

相关基金

Lnc-TRMT2A竞争性结合miR-520a调控炎性通路在精神分裂症发病中的作用研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于自主学习的Ad hoc Agent序贯决策研究

国家自然科学基金

46+阅读 · 2015年12月31日

保险中两类随机最优控制问题及策略过程概率分布研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于行为的SoS体系结构评价研究

国家自然科学基金

1+阅读 · 2012年12月31日

高阶格式在非定常流动中的若干基础问题研究及在动态特性问题中的应用

国家自然科学基金

0+阅读 · 2011年12月31日

Multi-Agent架构智能机器人推理机实时性研究

国家自然科学基金

1+阅读 · 2011年12月31日

专利h指数与专利信息网络测度研究

国家自然科学基金

1+阅读 · 2011年12月31日

受干扰信号的自适应滤波及信号检测

国家自然科学基金

0+阅读 · 2011年12月31日

miR-124和miR-27对阿尔茨海默病BACE1基因影响的分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

右侧颞顶联合区在注意瞬脱中的门控作用：fMRI、ERP和TMS研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员