White-Box Adversarial Policies in Deep Reinforcement Learning

Adversarial examples against AI systems pose both risks via malicious attacks and opportunities for improving robustness via adversarial training. In multiagent settings, adversarial policies can be developed by training an adversarial agent to minimize a victim agent's rewards. Prior work has studied black-box attacks where the adversary only sees the state observations and effectively treats the victim as any other part of the environment. In this work, we experiment with white-box adversarial policies to study whether an agent's internal state can offer useful information for other agents. We make three contributions. First, we introduce white-box adversarial policies in which an attacker can observe a victim's internal state at each timestep. Second, we demonstrate that white-box access to a victim makes for better attacks in two-agent environments, resulting in both faster initial learning and higher asymptotic performance against the victim. Third, we show that training against white-box adversarial policies can be used to make learners in single-agent environments more robust to domain shifts.
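The difference between the black-box and white-box settings described above can be illustrated with a minimal sketch. The abstract does not say which internal quantities the attacker observes, so the choice here (the victim's action logits and action probabilities) is an assumption, and `ToyVictim`, `black_box_observation`, and `white_box_observation` are hypothetical names introduced only for illustration. The sketch also assumes, for simplicity, that attacker and victim share the same environment observation.

```python
import math
from typing import List, Sequence


class ToyVictim:
    """Hypothetical victim: one linear layer followed by a softmax over actions."""

    def __init__(self, weights: Sequence[Sequence[float]]):
        # One row of weights per action.
        self.weights = [list(row) for row in weights]

    def logits(self, obs: Sequence[float]) -> List[float]:
        # Linear score for each action.
        return [sum(w * x for w, x in zip(row, obs)) for row in self.weights]

    def action_probs(self, obs: Sequence[float]) -> List[float]:
        z = self.logits(obs)
        m = max(z)  # subtract the max for numerical stability
        exps = [math.exp(v - m) for v in z]
        total = sum(exps)
        return [e / total for e in exps]


def black_box_observation(env_obs: Sequence[float]) -> List[float]:
    """Black-box attacker input: the environment observation only."""
    return list(env_obs)


def white_box_observation(env_obs: Sequence[float], victim: ToyVictim) -> List[float]:
    """White-box attacker input: environment observation plus victim internals.

    Assumption: "internal state" is taken to mean the victim's logits and
    action probabilities at the current timestep.
    """
    return list(env_obs) + victim.logits(env_obs) + victim.action_probs(env_obs)


victim = ToyVictim([[1.0, 0.0], [0.0, 1.0]])
obs = [0.5, -0.5]
print(len(black_box_observation(obs)))          # attacker input size, black-box
print(len(white_box_observation(obs, victim)))  # attacker input size, white-box
```

In this sketch the attacker's policy network would simply take the longer vector as input at every timestep; the training objective (minimizing the victim's reward) is unchanged between the two settings.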


