使用未知动态的在线RL的脆弱易感系统致毒机制 (Vulnerability-Aware Poisoning Mechanism for Online RL with Unknown Dynamics) - 专知论文

会员服务 ·

0

学成 · 在线 · 评论员 · Processing（编程语言） · 回合 ·

2021 年 5 月 4 日

Vulnerability-Aware Poisoning Mechanism for Online RL with Unknown Dynamics

翻译：使用未知动态的在线RL的脆弱易感系统致毒机制

Yanchao Sun,Da Huo,Furong Huang

Poisoning attacks on Reinforcement Learning (RL) systems could take advantage of RL algorithm's vulnerabilities and cause failure of the learning. However, prior works on poisoning RL usually either unrealistically assume the attacker knows the underlying Markov Decision Process (MDP), or directly apply the poisoning methods in supervised learning to RL. In this work, we build a generic poisoning framework for online RL via a comprehensive investigation of heterogeneous poisoning models in RL. Without any prior knowledge of the MDP, we propose a strategic poisoning algorithm called Vulnerability-Aware Adversarial Critic Poison (VA2C-P), which works for most policy-based deep RL agents, closing the gap that no poisoning method exists for policy-based RL agents. VA2C-P uses a novel metric, stability radius in RL, that measures the vulnerability of RL algorithms. Experiments on multiple deep RL agents and multiple environments show that our poisoning algorithm successfully prevents agents from learning a good policy or teaches the agents to converge to a target policy, with a limited attacking budget.

翻译：对强化学习系统进行毒害袭击可能利用RL算法的脆弱性,并造成学习失败。然而,先前的中毒RL工作通常不切实际地假设攻击者知道基本的Markov决策程序(MDP),或者直接在监督下对RL进行学习时采用中毒方法。在这项工作中,我们通过对RL的多种不同中毒模式进行全面调查,为在线RL建立一个普通中毒框架。在对MDP不事先了解的情况下,我们提议了一种称为脆弱性-Aware Aware Aversarial Critial Courty(VA2C-P)的战略中毒算法(VA2C-P),该算法对大多数基于政策的深度RL代理起作用,缩小了基于政策的RL代理没有中毒方法存在的差距。 VA2C-P在RL使用一种新的指标、稳定性半径,以测量RL算法的脆弱性。对多个深度RL代理和多个环境的实验表明,我们的中毒算法成功地阻止了代理人学习好的政策,或教给代理人学习目标政策,而攻击预算有限。

0

相关内容

【机器学习工具箱(机器学习实用库分类大列表)】《Machine Learning Toolbox》by Amit Chaudhary

【机器学习工具箱(机器学习实用库分类大列表)】《Machine Learning Toolbox》by Amit Chaudhary

专知会员服务

30+阅读 · 2020年7月12日

【Google】平滑对抗训练，Smooth Adversarial Training

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日

【ACL2020】对抗性文本生成，Improving Adversarial Text Generation

专知会员服务

52+阅读 · 2020年5月5日

【伯克利】黑盒机器翻译系统的模仿攻击与防御，Imitation Attacks and Defenses for Black-box Machine Translation Systems

【伯克利】黑盒机器翻译系统的模仿攻击与防御，Imitation Attacks and Defenses for Black-box Machine Translation Systems

专知会员服务

7+阅读 · 2020年5月4日

【基于模型的强化学习的博弈论框架】A Game Theoretic Framework for Model Based Reinforcement Learning

【基于模型的强化学习的博弈论框架】A Game Theoretic Framework for Model Based Reinforcement Learning

专知会员服务

131+阅读 · 2020年4月19日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【ICCV 2019】基于元学习的自动化神经网络通道 MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning

【ICCV 2019】基于元学习的自动化神经网络通道 MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning

专知会员服务

17+阅读 · 2019年11月17日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

已删除

将门创投

5+阅读 · 2019年8月19日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

Learning to Generate Noise for Multi-Attack Robustness

Arxiv

0+阅读 · 2021年6月24日

Emergent Social Learning via Multi-agent Reinforcement Learning

Arxiv

0+阅读 · 2021年6月22日

Off-Policy Reinforcement Learning with Delayed Rewards

Off-Policy Reinforcement Learning with Delayed Rewards

Arxiv

0+阅读 · 2021年6月22日

Boosting Offline Reinforcement Learning with Residual Generative Modeling

Arxiv

1+阅读 · 2021年6月22日

Composite Adversarial Attacks

Arxiv

12+阅读 · 2020年12月10日

Contrastive Learning with Adversarial Examples

Arxiv

5+阅读 · 2020年10月22日

Model-based Adversarial Meta-Reinforcement Learning

Arxiv

5+阅读 · 2020年6月16日

Sequential Attacks on Agents for Long-Term Adversarial Goals

Sequential Attacks on Agents for Long-Term Adversarial Goals

Arxiv

5+阅读 · 2018年7月5日

Adversarial Meta-Learning

Arxiv

7+阅读 · 2018年6月8日

Learning to Adapt: Meta-Learning for Model-Based Control

Arxiv

9+阅读 · 2018年3月30日

VIP会员

文章信息

相关主题

Processing（编程语言）

相关VIP内容

【机器学习工具箱(机器学习实用库分类大列表)】《Machine Learning Toolbox》by Amit Chaudhary

【机器学习工具箱(机器学习实用库分类大列表)】《Machine Learning Toolbox》by Amit Chaudhary

专知会员服务

30+阅读 · 2020年7月12日

【Google】平滑对抗训练，Smooth Adversarial Training

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日

【ACL2020】对抗性文本生成，Improving Adversarial Text Generation

专知会员服务

52+阅读 · 2020年5月5日

【伯克利】黑盒机器翻译系统的模仿攻击与防御，Imitation Attacks and Defenses for Black-box Machine Translation Systems

【伯克利】黑盒机器翻译系统的模仿攻击与防御，Imitation Attacks and Defenses for Black-box Machine Translation Systems

专知会员服务

7+阅读 · 2020年5月4日

【基于模型的强化学习的博弈论框架】A Game Theoretic Framework for Model Based Reinforcement Learning

【基于模型的强化学习的博弈论框架】A Game Theoretic Framework for Model Based Reinforcement Learning

专知会员服务

131+阅读 · 2020年4月19日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【ICCV 2019】基于元学习的自动化神经网络通道 MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning

【ICCV 2019】基于元学习的自动化神经网络通道 MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning

专知会员服务

17+阅读 · 2019年11月17日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《乌克兰无人机产业：志愿者与政策在构建新兴无人机产业中的协同作用》最新报告

《人工智能辅助决策中的数据可视化：系统性综述》

人工智能驱动弹药制造现代化：美国陆军转型之路

《敏捷作战部署中枢纽-辐条基地选址优化研究》80页

相关资讯

已删除

将门创投

5+阅读 · 2019年8月19日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

相关论文

Learning to Generate Noise for Multi-Attack Robustness

Arxiv

0+阅读 · 2021年6月24日

Emergent Social Learning via Multi-agent Reinforcement Learning

Arxiv

0+阅读 · 2021年6月22日

Off-Policy Reinforcement Learning with Delayed Rewards

Off-Policy Reinforcement Learning with Delayed Rewards

Arxiv

0+阅读 · 2021年6月22日

Boosting Offline Reinforcement Learning with Residual Generative Modeling

Arxiv

1+阅读 · 2021年6月22日

Composite Adversarial Attacks

Arxiv

12+阅读 · 2020年12月10日

Contrastive Learning with Adversarial Examples

Arxiv

5+阅读 · 2020年10月22日

Model-based Adversarial Meta-Reinforcement Learning

Arxiv

5+阅读 · 2020年6月16日

Sequential Attacks on Agents for Long-Term Adversarial Goals

Sequential Attacks on Agents for Long-Term Adversarial Goals

Arxiv

5+阅读 · 2018年7月5日

Adversarial Meta-Learning

Arxiv

7+阅读 · 2018年6月8日

Learning to Adapt: Meta-Learning for Model-Based Control

Arxiv

9+阅读 · 2018年3月30日

微信扫码咨询专知VIP会员