We study the utility of incorporating entity type abstractions into pre-trained Transformers and test these methods on four NLP tasks requiring different forms of logical reasoning: (1) compositional language understanding with text-based relational reasoning (CLUTRR), (2) abductive reasoning (ProofWriter), (3) multi-hop question answering (HotpotQA), and (4) conversational question answering (CoQA). We propose and empirically explore three ways to add such abstraction: (i) as additional input embeddings, (ii) as a separate sequence to encode, and (iii) as an auxiliary prediction task for the model. Overall, our analysis demonstrates that models with abstract entity knowledge perform better than those without it. The best abstraction-aware models achieve overall accuracies of 88.8% and 91.8%, compared to 62.9% and 89.8% for the baseline model, on CLUTRR and ProofWriter, respectively. For HotpotQA and CoQA, however, we find that F1 scores improve by only 0.5% on average. Our results suggest that the benefit of explicit abstraction is significant in formally defined logical reasoning settings requiring many reasoning hops, but less so for NLP tasks with less formal logical structure.
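To make option (i) concrete, below is a minimal PyTorch sketch of adding entity-type embeddings to token embeddings before the Transformer encoder, in the same way BERT-style models sum segment and position embeddings. This is an illustrative assumption, not the paper's implementation: the class name `TypeAwareEmbedding`, the argument `num_entity_types`, and the convention of reserving type id 0 for non-entity tokens are all hypothetical choices made for the example.

```python
import torch
import torch.nn as nn

class TypeAwareEmbedding(nn.Module):
    """Token embeddings augmented with entity-type embeddings (option (i)).

    A sketch only: the real models in the paper build on pre-trained
    Transformers, whereas this module shows the embedding-sum idea in isolation.
    """

    def __init__(self, vocab_size: int, num_entity_types: int, hidden_size: int):
        super().__init__()
        self.token_embed = nn.Embedding(vocab_size, hidden_size)
        # Reserve type id 0 for tokens that are not part of any typed entity;
        # padding_idx=0 keeps that embedding fixed at zero.
        self.type_embed = nn.Embedding(num_entity_types + 1, hidden_size, padding_idx=0)

    def forward(self, token_ids: torch.Tensor, entity_type_ids: torch.Tensor) -> torch.Tensor:
        # Both inputs have shape (batch, seq_len). Summing the two embeddings
        # injects the abstract type signal into every token representation
        # before any self-attention layer sees the sequence.
        return self.token_embed(token_ids) + self.type_embed(entity_type_ids)

# Example usage (hypothetical sizes):
# embed = TypeAwareEmbedding(vocab_size=30522, num_entity_types=7, hidden_size=768)
# hidden = embed(token_ids, entity_type_ids)  # then feed into the Transformer stack
```

Options (ii) and (iii) differ only in where the type signal enters: (ii) encodes the type sequence separately rather than summing it into the token embeddings, and (iii) keeps the input unchanged but adds a type-prediction head whose loss is trained jointly with the main task.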