In this paper, we explore the robustness of the Multi-Task Deep Neural Network (MT-DNN) against non-targeted adversarial attacks across Natural Language Understanding (NLU) tasks, as well as possible ways to defend against them. Liu et al. have shown that the MT-DNN, owing to the regularization effect produced by training on cross-task data, is more robust than a vanilla BERT model trained on a single task (by 1.1%-1.5% absolute accuracy). We further show that although the MT-DNN generalizes better, making it easily transferable across domains and tasks, it can still be compromised: after only two attacks (1-character and 2-character perturbations), accuracy drops by 42.05% and 32.24% on the SNLI and SciTail tasks, respectively. Finally, we propose a domain-agnostic defense that restores the model's accuracy (by 36.75% and 25.94%, respectively), outperforming both a general-purpose defense and an off-the-shelf spell checker.