Research on adversarial attacks has become increasingly popular in recent years. One area that prior research has left largely unexplored is the effect of adversarial attacks on code-mixed data. In the present work, we therefore present the first generalized text-perturbation framework for attacking code-mixed classification models in a black-box setting. We rely on perturbation techniques that preserve the semantic structure of a sentence while keeping the attack imperceptible to a human reader. Our methodology uses the importance of each token to decide where to attack, and then applies one of several perturbation strategies at those positions. We test our strategies on sentiment classification models trained on Bengali-English and Hindi-English code-mixed datasets and reduce their F1-scores by nearly 51% and 53%, respectively; these scores can be reduced further by perturbing more tokens per sentence.
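As a rough illustration of the importance-guided attack described above, the sketch below shows one common way to score token importance in a black-box setting (leave-one-out probing) and then perturb the top-scoring tokens. The `predict_proba` and `perturb` callables are hypothetical placeholders for a query-only classifier interface and for one of the semantics-preserving perturbation strategies; this is a minimal sketch under those assumptions, not the paper's actual implementation.

```python
from typing import Callable, List

def token_importance(sentence: str,
                     predict_proba: Callable[[str], List[float]],
                     target_label: int) -> List[float]:
    """Score each token by the drop in the target-label probability
    when that token is removed (leave-one-out probing)."""
    tokens = sentence.split()
    base = predict_proba(sentence)[target_label]
    scores = []
    for i in range(len(tokens)):
        ablated = " ".join(tokens[:i] + tokens[i + 1:])
        scores.append(base - predict_proba(ablated)[target_label])
    return scores

def attack(sentence: str,
           predict_proba: Callable[[str], List[float]],
           target_label: int,
           perturb: Callable[[str], str],
           k: int = 2) -> str:
    """Perturb the k most important tokens with a hypothetical
    semantics-preserving perturbation function."""
    tokens = sentence.split()
    scores = token_importance(sentence, predict_proba, target_label)
    # Attack the k positions whose removal hurt the prediction most.
    for i in sorted(range(len(tokens)), key=lambda i: -scores[i])[:k]:
        tokens[i] = perturb(tokens[i])
    return " ".join(tokens)
```

Because only `predict_proba` queries are needed, this kind of scoring requires no access to model weights or gradients, which is what makes the setting black-box.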