The factual knowledge acquired during pre-training and stored in the parameters of Language Models (LMs) can be useful in downstream tasks (e.g., question answering or textual inference). However, some facts can be incorrectly induced or become obsolete over time. We present KnowledgeEditor, a method which can be used to edit this knowledge and, thus, fix 'bugs' or unexpected predictions without the need for expensive re-training or fine-tuning. Besides being computationally efficient, KnowledgeEditor does not require any modifications in LM pre-training (e.g., the use of meta-learning). In our approach, we train a hyper-network with constrained optimization to modify a fact without affecting the rest of the knowledge; the trained hyper-network is then used to predict the weight update at test time. We show KnowledgeEditor's efficacy with two popular architectures and knowledge-intensive tasks: i) a BERT model fine-tuned for fact-checking, and ii) a sequence-to-sequence BART model for question answering. With our method, changing a prediction for a specific wording of a query tends to change the predictions for its paraphrases consistently as well. We show that this can be further encouraged by exploiting (e.g., automatically generated) paraphrases during training. Interestingly, our hyper-network can be regarded as a 'probe' revealing which components need to be changed to manipulate factual knowledge; our analysis shows that the updates tend to be concentrated on a small subset of components. Source code is available at https://github.com/nicola-decao/KnowledgeEditor
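To make the core idea concrete, here is a minimal toy sketch of the "constrained update" intuition described above: a stand-in for the trained hyper-network maps the gradient of the loss on the fact being edited to a weight update, and the update is projected onto a small norm ball so the rest of the model's knowledge is (approximately) left intact. The function name, the linear form of the predictor, and the hard norm projection are illustrative assumptions, not the paper's actual architecture or constraint (which uses a learned network and a KL-based constraint on unedited predictions).

```python
import math

def hypernetwork_update(grad, scale=0.2, max_norm=0.1):
    """Toy stand-in for a trained hyper-network.

    grad     -- gradient of the edit loss w.r.t. a weight vector
                (flattened), for the single fact we want to change
    scale    -- illustrative learned coefficient of the predictor
    max_norm -- budget on the update size; keeping the update small
                is a crude proxy for "don't disturb other knowledge"
    """
    # Predict a raw update from the gradient (here, a simple rescaling).
    delta = [scale * g for g in grad]
    # Enforce the constraint by projecting onto the norm ball.
    norm = math.sqrt(sum(d * d for d in delta))
    if norm > max_norm:
        delta = [d * max_norm / norm for d in delta]
    return delta

# Usage: edit a small weight vector so one "fact" flips, with bounded drift.
weights = [1.0, -0.5, 0.3]
grad = [0.5, -1.0, 0.2]                      # gradient on the fact to edit
delta = hypernetwork_update(grad)
edited = [w - d for w, d in zip(weights, delta)]
```

In the actual method the update predictor is a learned network conditioned on the edit example, and the constraint is enforced during hyper-network training rather than by a test-time projection; the sketch only conveys the "edit one fact under a budget" structure.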