Adversarial training has been shown to improve the generalization performance of deep learning models in various natural language processing tasks. Existing works usually formulate adversarial training as a zero-sum game, which is solved by alternating gradient descent/ascent algorithms. Such a formulation treats the adversarial and the defending players equally, which is undesirable because only the defending player contributes to the generalization performance. To address this issue, we propose Stackelberg Adversarial Training (SALT), which formulates adversarial training as a Stackelberg game. This formulation induces a competition between a leader and a follower, where the follower generates perturbations, and the leader trains the model subject to the perturbations. Different from conventional adversarial training, in SALT, the leader is in an advantageous position. When the leader moves, it recognizes the strategy of the follower and takes the anticipated follower's outcomes into consideration. Such a leader's advantage enables us to improve the model fitting to the unperturbed data. The leader's strategic information is captured by the Stackelberg gradient, which is obtained using an unrolling algorithm. Our experimental results on a set of machine translation and natural language understanding tasks show that SALT outperforms existing adversarial training baselines across all tasks.
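The unrolling idea behind the Stackelberg gradient can be illustrated on a toy problem. The sketch below is purely illustrative and is not the paper's implementation: it uses a scalar linear model with squared loss, a single unrolled follower (perturbation) step, and hand-derived gradients so that the leader's update differentiates *through* the follower's ascent step. All names and hyperparameters are assumptions for the sake of the example.

```python
# Toy sketch of one-step-unrolled Stackelberg adversarial training,
# assuming a scalar model f(x) = theta * x with squared loss.
# Illustrative only; not the SALT implementation.

def loss(theta, x, delta, y):
    """Squared loss on the (possibly perturbed) input x + delta."""
    return (theta * (x + delta) - y) ** 2

def follower_step(theta, x, delta, y, eta):
    """Follower: one gradient-ASCENT step on delta to maximize the loss."""
    grad_delta = 2.0 * (theta * (x + delta) - y) * theta
    return delta + eta * grad_delta

def stackelberg_grad(theta, x, y, eta):
    """Leader gradient that differentiates THROUGH the unrolled follower step.

    Because delta1 depends on theta, the total derivative picks up an extra
    term (dL/ddelta1) * (ddelta1/dtheta) beyond the direct gradient -- this
    extra term is what lets the leader anticipate the follower's response.
    """
    delta0 = 0.0
    delta1 = follower_step(theta, x, delta0, y, eta)
    a = x + delta0
    # d(delta1)/d(theta): derivative of eta * 2*(theta*a - y)*theta w.r.t. theta
    ddelta1_dtheta = eta * (4.0 * a * theta - 2.0 * y)
    resid = theta * (x + delta1) - y
    dL_dtheta = 2.0 * resid * (x + delta1)   # direct (defender) term
    dL_ddelta1 = 2.0 * resid * theta         # indirect (anticipation) term
    return dL_dtheta + dL_ddelta1 * ddelta1_dtheta

# Leader: plain gradient descent using the unrolled Stackelberg gradient.
x, y = 1.0, 2.0
theta, eta_follower, lr_leader = 0.0, 0.1, 0.05
initial_clean_loss = loss(theta, x, 0.0, y)
for _ in range(100):
    theta -= lr_leader * stackelberg_grad(theta, x, y, eta_follower)
final_clean_loss = loss(theta, x, 0.0, y)
print(initial_clean_loss, final_clean_loss)  # fit to unperturbed data improves
```

In practice the follower would be unrolled for several steps and the chain rule handled by automatic differentiation rather than by hand; the toy version only shows where the extra anticipation term enters the leader's update.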