The ability of pretrained Transformers to remember factual knowledge is essential but remains limited in existing models. Inspired by existing work that regards Feed-Forward Networks (FFNs) in Transformers as key-value memories, we design a Neural Knowledge Bank (NKB) and a knowledge injection strategy to introduce extra factual knowledge into pretrained Transformers. The NKB takes the form of additional knowledgeable memory slots attached to the FFN, and this memory-like architecture makes it highly interpretable and flexible. When injecting extra knowledge with the Salient Span Masking (SSM) pretraining objective, we freeze the original pretrained model and train only the NKB, so the general language modeling ability of the original pretrained model is not affected. By mounting the NKB onto the T5 model, we verify its strong ability to store extra factual knowledge on three closed-book question answering datasets. We also show that mounting the NKB does not degrade the general language modeling ability of T5 on two representative tasks, summarization and machine translation. Further, we thoroughly analyze the interpretability of the NKB and reveal the meaning of its keys and values in a human-readable way. Finally, we demonstrate the flexibility of the NKB by directly modifying its value vectors to update the factual knowledge stored in it.
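To make the key-value-memory view concrete, the sketch below illustrates one plausible way an FFN layer could be extended with extra trainable memory slots while the original pretrained weights stay frozen. It is a minimal illustration, not the paper's implementation: the class name `FFNWithNKB`, the slot count, the ReLU activation, and where the NKB output is added are all assumptions for exposition.

```python
import torch
import torch.nn as nn

class FFNWithNKB(nn.Module):
    """Illustrative sketch: an FFN viewed as key-value memory,
    extended with extra Neural-Knowledge-Bank-style slots."""

    def __init__(self, d_model: int, d_ff: int, num_nkb_slots: int):
        super().__init__()
        # Original FFN (rows of w_in act as keys, rows of w_out as values).
        self.w_in = nn.Linear(d_model, d_ff, bias=False)
        self.w_out = nn.Linear(d_ff, d_model, bias=False)
        # Extra knowledgeable memory slots: NKB keys and values (assumed layout).
        self.nkb_keys = nn.Linear(d_model, num_nkb_slots, bias=False)
        self.nkb_values = nn.Linear(num_nkb_slots, d_model, bias=False)

    def freeze_original(self) -> None:
        # Knowledge injection trains only the NKB; the pretrained FFN stays fixed.
        for p in list(self.w_in.parameters()) + list(self.w_out.parameters()):
            p.requires_grad = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Key activations weight the corresponding value vectors,
        # for both the original slots and the added NKB slots.
        hidden = torch.relu(self.w_in(x))
        nkb_hidden = torch.relu(self.nkb_keys(x))
        return self.w_out(hidden) + self.nkb_values(nkb_hidden)

# Hypothetical usage: freeze the pretrained part, train only the NKB slots.
layer = FFNWithNKB(d_model=512, d_ff=2048, num_nkb_slots=1024)
layer.freeze_original()
out = layer(torch.randn(2, 10, 512))  # (batch, sequence, d_model)
```

Under this framing, updating stored facts reduces to editing rows of `nkb_values`, which is the kind of direct value-vector modification the abstract describes.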