Adversarial regularization has been shown to improve the generalization performance of deep learning models in various natural language processing tasks. Existing works usually formulate the method as a zero-sum game, which is solved by alternating gradient descent/ascent algorithms. Such a formulation treats the adversarial and the defending players equally, which is undesirable because only the defending player contributes to the generalization performance. To address this issue, we propose Stackelberg Adversarial Regularization (SALT), which formulates adversarial regularization as a Stackelberg game. This formulation induces a competition between a leader and a follower, where the follower generates perturbations, and the leader trains the model subject to the perturbations. Different from conventional approaches, in SALT the leader is in an advantageous position: when the leader moves, it recognizes the follower's strategy and takes the anticipated follower's outcome into consideration. This advantage enables the leader to better fit the model to the unperturbed data. The leader's strategic information is captured by the Stackelberg gradient, which is obtained using an unrolling algorithm. Our experimental results on a set of machine translation and natural language understanding tasks show that SALT outperforms existing adversarial regularization baselines across all tasks. Our code is available at https://github.com/SimiaoZuo/Stackelberg-Adv.
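The unrolling idea behind the Stackelberg gradient can be illustrated with a minimal PyTorch-style sketch. This is not the authors' implementation (see the repository above); the interface `model(x_embed)` operating on input embeddings, the follower step size, the number of unrolled steps, and the combined clean-plus-adversarial leader loss are all assumptions made for illustration. The follower runs a few differentiable gradient-ascent steps on an embedding perturbation, and the leader then backpropagates through that unrolled trajectory.

```python
import torch


def salt_step(model, loss_fn, x_embed, y, follower_steps=1, follower_lr=0.1):
    """Hypothetical SALT-style update: unroll the follower, then differentiate
    through its trajectory so the leader's gradient anticipates the follower."""
    # Follower: start from a zero perturbation on the input embeddings.
    delta = torch.zeros_like(x_embed, requires_grad=True)

    for _ in range(follower_steps):
        adv_loss = loss_fn(model(x_embed + delta), y)
        # create_graph=True keeps the follower's update differentiable,
        # which is what lets the leader backpropagate through the unrolling.
        (grad_delta,) = torch.autograd.grad(adv_loss, delta, create_graph=True)
        delta = delta + follower_lr * grad_delta  # gradient ascent on the perturbation

    # Leader: fit the unperturbed data and the perturbed data; backpropagation
    # flows through the follower's unrolled steps (the Stackelberg gradient).
    leader_loss = loss_fn(model(x_embed), y) + loss_fn(model(x_embed + delta), y)
    leader_loss.backward()
    return leader_loss.item()
```

The key difference from conventional alternating descent/ascent is the `create_graph=True` call: in a zero-sum formulation the perturbation would be detached before the defender's update, whereas here the leader's gradient also accounts for how the follower's perturbation reacts to the model parameters.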