Contextual bandit algorithms have many applications in a variety of scenarios. In order to develop trustworthy contextual bandit systems, it is essential to understand the impact of various adversarial attacks on contextual bandit algorithms. In this paper, we propose a new class of attacks: action poisoning attacks, in which an adversary can change the action signal selected by the agent. We design action poisoning attack schemes against linear contextual bandit algorithms in both white-box and black-box settings. We further analyze the cost of the proposed attack strategies for a very popular and widely used bandit algorithm: LinUCB. We show that, in both the white-box and black-box settings, the proposed attack schemes can force the LinUCB agent to pull a target arm very frequently while incurring only logarithmic cost.
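To make the attack model concrete, the following is a minimal sketch of the general idea, not the paper's exact scheme: a toy white-box action poisoning attack on a standard per-arm LinUCB agent. All parameter values (`theta`, `alpha`, the context distribution) are illustrative assumptions. When the agent announces a non-target arm, the attacker silently swaps the executed action for the worst arm in that context, so every non-target arm appears to yield poor rewards and the agent converges to the target arm.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K, T, alpha, noise = 2, 4, 2000, 1.0, 0.1
# hypothetical per-arm reward parameters (chosen for illustration)
theta = np.array([[1.0, 1.0],    # arm 0: honest best arm
                  [0.5, 0.5],    # arm 1: attacker's target
                  [0.2, 0.8],    # arm 2
                  [-1.0, -1.0]]) # arm 3: uniformly worst arm
target = 1

A = [np.eye(d) for _ in range(K)]    # per-arm ridge design matrices
b = [np.zeros(d) for _ in range(K)]  # per-arm response vectors
pulls = np.zeros(K, dtype=int)

for _ in range(T):
    g = np.abs(rng.normal(size=d))
    x = g / np.linalg.norm(g)        # nonnegative unit-norm context
    ucb = np.empty(K)
    for k in range(K):
        Ainv = np.linalg.inv(A[k])
        ucb[k] = (Ainv @ b[k]) @ x + alpha * np.sqrt(x @ Ainv @ x)
    a = int(np.argmax(ucb))          # action the agent announces
    pulls[a] += 1
    # white-box attacker intervenes only on non-target actions,
    # replacing them with the context-wise worst arm
    executed = a if a == target else int(np.argmin(theta @ x))
    r = theta[executed] @ x + noise * rng.normal()
    A[a] += np.outer(x, x)           # agent credits the arm it chose
    b[a] += r * x

print(f"target-arm pull fraction: {pulls[target] / T:.2f}")
```

In this toy run the non-target arms accumulate strongly negative reward estimates after only a few pulls, so the agent pulls the target arm for the vast majority of rounds; the attacker's cost (number of interventions) is correspondingly small, echoing the logarithmic-cost phenomenon described above in a purely illustrative way.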