We propose an effective consistency training framework that enforces a model's predictions on original and perturbed inputs to be similar, by adding discrete noise that incurs the highest divergence between predictions. This virtual adversarial discrete noise, obtained by replacing a small portion of tokens while preserving the original semantics as much as possible, efficiently pushes the model's decision boundary. Moreover, we perform an iterative refinement process to alleviate the degraded fluency of the perturbed sentence caused by the conditional independence assumption. Experimental results show that our method outperforms other consistency training baselines that use text editing, paraphrasing, or continuous noise on semi-supervised text classification tasks and a robustness benchmark.
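As a minimal sketch, the consistency objective described above can be written as a KL divergence between the model's predictions on the original and perturbed inputs. The PyTorch code below is illustrative only; `model` and `perturbed_ids` are hypothetical stand-ins, and producing `perturbed_ids` by adversarially replacing a small portion of tokens is the paper's contribution, not shown here.

```python
# Illustrative consistency loss, assuming a generic classifier `model`
# that maps token ids to class logits, and `perturbed_ids` produced by
# some discrete perturbation of `input_ids` (hypothetical stand-ins).
import torch
import torch.nn.functional as F

def consistency_loss(model, input_ids, perturbed_ids):
    """KL divergence between predictions on original and perturbed inputs."""
    with torch.no_grad():
        # Predictions on the original input serve as a fixed target.
        target = F.softmax(model(input_ids), dim=-1)
    # Predictions on the perturbed input are pushed toward the target.
    log_pred = F.log_softmax(model(perturbed_ids), dim=-1)
    return F.kl_div(log_pred, target, reduction="batchmean")
```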