Large Transformer models have achieved impressive performance on many natural language tasks. In particular, Transformer-based language models have been shown to be highly capable of encoding factual knowledge in their vast number of parameters. While improving the memorization and generalization of Transformers has been widely studied, it is not well understood how to make Transformers forget specific old facts and memorize new ones. In this paper, we propose a new task of \emph{explicitly modifying specific factual knowledge in Transformer models while ensuring that model performance does not degrade on the unmodified facts}. This task is useful in many scenarios, such as updating stale knowledge, protecting privacy, and eliminating unintended biases stored in the models. We benchmark several approaches that provide natural baseline performance on this task. This leads to the discovery of key components of a Transformer model that are especially effective for knowledge modification. The work also provides insight into the role that different training phases (such as pretraining and fine-tuning) play in memorization and knowledge modification.