Pre-trained language models (LMs) often struggle to reason logically or generalize in a compositional fashion. Recent work suggests that incorporating external entity knowledge can improve LMs' abilities to reason and generalize. However, the effect of explicitly providing entity abstraction remains unclear, especially since recent studies suggest that pre-trained LMs already encode some of that knowledge in their parameters. We study the utility of incorporating entity type abstractions into pre-trained Transformers and test these methods on four NLP tasks requiring different forms of logical reasoning: (1) compositional language understanding with text-based relational reasoning (CLUTRR), (2) abductive reasoning (ProofWriter), (3) multi-hop question answering (HotpotQA), and (4) conversational question answering (CoQA). We propose and empirically explore three ways to add such abstraction: (i) as additional input embeddings, (ii) as a separate sequence to encode, and (iii) as an auxiliary prediction task for the model. Overall, our analysis demonstrates that models with abstract entity knowledge perform better than models without it. However, our experiments also show that the benefits strongly depend on the technique used and the task at hand. The best abstraction-aware models achieve an overall accuracy of 88.8% and 91.8%, compared to the baseline model's 62.3% and 89.8%, on CLUTRR and ProofWriter respectively. In addition, abstraction-aware models show improved compositional generalization in both interpolation and extrapolation settings. However, for HotpotQA and CoQA, we find that F1 scores improve by only 0.5% on average. Our results suggest that the benefit of explicit abstraction is significant in formally defined logical reasoning settings requiring many reasoning hops, but less so for NLP tasks with less formal logical structure.
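To make option (i) concrete, the following is a minimal sketch of how entity-type embeddings could be summed with token embeddings before the Transformer layers, analogous to segment or position embeddings in BERT-style encoders. It assumes a PyTorch setting; the class name `TypedInputEmbedding`, the vocabulary and type-inventory sizes, and the example IDs are all illustrative and not taken from the paper.

```python
# Sketch of adding entity-type information as additional input embeddings.
# All sizes and the type inventory are hypothetical.
import torch
import torch.nn as nn

class TypedInputEmbedding(nn.Module):
    def __init__(self, vocab_size=30522, num_entity_types=8, dim=768):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, dim)        # standard token embeddings
        self.typ = nn.Embedding(num_entity_types, dim)  # entity-type embeddings (0 = no type)

    def forward(self, token_ids, type_ids):
        # Sum token and entity-type embeddings, as is commonly done for
        # segment/position embeddings in BERT-style encoders.
        return self.tok(token_ids) + self.typ(type_ids)

embed = TypedInputEmbedding()
token_ids = torch.tensor([[101, 2054, 2003, 102]])  # toy token ids
type_ids = torch.tensor([[0, 3, 0, 0]])             # toy entity-type ids (e.g., 3 = PERSON)
hidden = embed(token_ids, type_ids)
print(hidden.shape)  # torch.Size([1, 4, 768]); fed to the Transformer encoder
```

Options (ii) and (iii) differ mainly in where the type information enters: as a separately encoded sequence, or as labels for an auxiliary type-prediction loss added to the main task objective.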