用单语数据中毒对神经机器翻译进行有针对性的攻击 (Putting words into the system's mouth: A targeted attack on neural machine translation using monolingual data poisoning)

Neural machine translation systems are known to be vulnerable to adversarial test inputs, however, as we show in this paper, these systems are also vulnerable to training attacks. Specifically, we propose a poisoning attack in which a malicious adversary inserts a small poisoned sample of monolingual text into the training set of a system trained using back-translation. This sample is designed to induce a specific, targeted translation behaviour, such as peddling misinformation. We present two methods for crafting poisoned examples, and show that only a tiny handful of instances, amounting to only 0.02% of the training set, is sufficient to enact a successful attack. We outline a defence method against said attacks, which partly ameliorates the problem. However, we stress that this is a blind-spot in modern NMT, demanding immediate attention.

翻译：众所周知,神经机器翻译系统很容易受到对抗性测试输入,然而,正如我们在本文中所表明的那样,这些系统也容易受到训练攻击。具体地说,我们提议进行中毒攻击,恶意敌人在训练中使用反译法的系统训练组中插入少量单语文字的有毒样本。这个样本旨在诱导一种特定的、有针对性的翻译行为,例如兜售错误信息。我们提出了两种方法来编篡有毒的例子,并表明只有为数不多的几例(仅占训练组的0.02%)足以实施成功的攻击。我们概述了一种防御方法来对付上述攻击,这在一定程度上缓解了问题。然而,我们强调这是现代NMT的盲点,需要立即关注。

相关内容

Machine Translation

关注 209

机器翻译（Machine Translation）涵盖计算语言学和语言工程的所有分支，包含多语言方面。特色论文涵盖理论，描述或计算方面的任何下列主题:双语和多语语料库的编写和使用，计算机辅助语言教学，非罗马字符集的计算含义，连接主义翻译方法，对比语言学等。官网地址：http://dblp.uni-trier.de/db/journals/mt/

《中国信创产业发展白皮书（2021）》发布, 34页pdf

专知会员服务

89+阅读 · 2021年3月3日

不可错过！UIUC最新《对抗机器学习》课程，附PPT

专知会员服务

35+阅读 · 2020年12月28日

迁移学习简明教程，11页ppt

专知会员服务

108+阅读 · 2020年8月4日

【伯克利】黑盒机器翻译系统的模仿攻击与防御，Imitation Attacks and Defenses for Black-box Machine Translation Systems

专知会员服务

7+阅读 · 2020年5月4日