Attacking Neural Machine Translation models is an inherently combinatorial task on discrete sequences, solved with approximate heuristics. Most methods use the gradient to attack the model on each sample independently. Instead of mechanically applying the gradient, could we learn to produce meaningful adversarial attacks? In contrast to existing approaches, we learn to attack a model by training an adversarial generator based on a language model. We propose the Masked Adversarial Generation (MAG) model, which learns to perturb the translation model throughout the training process. Experiments show that it improves the robustness of machine translation models while being faster than competing methods.