Large pre-trained language models such as BERT are widely used for natural language understanding (NLU) tasks. However, recent findings have revealed that pre-trained language models are insensitive to word order: performance on NLU tasks remains unchanged even after the words of a sentence are randomly permuted, destroying crucial syntactic information. To restore the importance of word order, we propose a simple approach called Forced Invalidation (FI): forcing the model to identify permuted sequences as invalid samples. We perform an extensive evaluation of our approach on various English NLU and QA tasks, covering both BERT-based models and attention-based models over word embeddings. Our experiments demonstrate that Forced Invalidation significantly improves the models' sensitivity to word order.
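The core idea of Forced Invalidation, as stated above, is to make the model treat word-permuted sequences as invalid. One plausible way to realize this is as training-data augmentation: for each labeled sentence, add a randomly permuted copy carrying a dedicated "invalid" label. The sketch below illustrates that idea; the function name, the `(text, label)` dataset format, and the exact augmentation scheme are our assumptions for illustration, not the paper's specified implementation.

```python
import random


def forced_invalidation_augment(dataset, invalid_label, seed=0):
    """Sketch of a Forced Invalidation augmentation step (assumed scheme):
    append a word-permuted copy of each example labeled as invalid."""
    rng = random.Random(seed)
    augmented = list(dataset)  # keep the original examples unchanged
    for text, label in dataset:
        words = text.split()
        permuted = words[:]
        rng.shuffle(permuted)
        # Skip the rare case where shuffling returns the original order,
        # since that copy would not be a genuinely invalid sequence.
        if permuted != words:
            augmented.append((" ".join(permuted), invalid_label))
    return augmented


# Usage: the training set now contains both the valid originals and
# their permuted counterparts labeled as invalid.
train = [("the cat sat on the mat", "entailment")]
augmented = forced_invalidation_augment(train, "invalid")
```

The model is then trained on the augmented set with the extra label class, so that assigning high confidence to a scrambled sentence is explicitly penalized.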