Recently, commonsense knowledge models - pretrained language models (LMs) fine-tuned on knowledge graph (KG) tuples - have shown that considerable amounts of commonsense knowledge can be encoded in the parameters of large language models. However, as parallel studies show that LMs are poor at hypothesizing declarative commonsense relationships on their own, it remains unclear whether this knowledge is learned during pretraining or from fine-tuning on KG examples. To investigate this question, we train commonsense knowledge models in few-shot settings to study the emergence of their commonsense representation abilities. Our results show that commonsense knowledge models can rapidly adapt from limited examples, indicating that KG fine-tuning serves to learn an interface to knowledge already encoded during pretraining. Importantly, our analysis of absolute, angular, and distributional parameter changes during few-shot fine-tuning provides novel insights into how this interface is learned.
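The three kinds of parameter change named above can be illustrated with a short sketch. This is not the paper's code; the exact definitions below (mean absolute difference, angle between flattened weight vectors, and a symmetrized-KL divergence over value histograms) are assumptions about one plausible way to instantiate each metric for a single weight tensor.

```python
# Illustrative sketch (assumed definitions, not the paper's implementation):
# three ways to quantify how a weight tensor changes between a pretrained
# checkpoint (w_pre) and a fine-tuned checkpoint (w_ft).
import numpy as np

def absolute_change(w_pre, w_ft):
    """Mean absolute per-parameter change."""
    return float(np.mean(np.abs(w_ft - w_pre)))

def angular_change(w_pre, w_ft):
    """Angle (radians) between the flattened weight vectors."""
    a, b = w_pre.ravel(), w_ft.ravel()
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def distributional_change(w_pre, w_ft, bins=50):
    """Symmetrized KL divergence between histograms of weight values
    over shared bins (one possible 'distributional' measure)."""
    lo = min(w_pre.min(), w_ft.min())
    hi = max(w_pre.max(), w_ft.max())
    p, _ = np.histogram(w_pre, bins=bins, range=(lo, hi))
    q, _ = np.histogram(w_ft, bins=bins, range=(lo, hi))
    # Smooth and normalize to probability distributions.
    p = (p + 1e-12) / (p + 1e-12).sum()
    q = (q + 1e-12) / (q + 1e-12).sum()
    kl = lambda x, y: float(np.sum(x * np.log(x / y)))
    return 0.5 * (kl(p, q) + kl(q, p))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_pre = rng.normal(size=(64, 64))
    w_ft = w_pre + 0.01 * rng.normal(size=(64, 64))  # small fine-tuning drift
    print(absolute_change(w_pre, w_ft),
          angular_change(w_pre, w_ft),
          distributional_change(w_pre, w_ft))
```

In a real analysis these metrics would be computed per layer across fine-tuning steps, which is what lets one see where in the network the "interface" is being learned.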