Current Natural Language Inference (NLI) models achieve impressive results, sometimes outperforming humans when evaluated on in-distribution test sets. However, as these models are known to learn from annotation artefacts and dataset biases, it is unclear to what extent the models are learning the task of NLI rather than exploiting shallow heuristics in their training data. We address this issue by introducing a logical reasoning framework for NLI, creating highly transparent model decisions that are based on logical rules. Unlike prior work, we show that improved interpretability can be achieved without decreasing predictive accuracy. We almost fully retain performance on SNLI, while also identifying the exact hypothesis spans that are responsible for each model prediction. Using the e-SNLI human explanations, we verify that our model makes sensible decisions at a span level, despite not using any span labels during training. We can further improve model performance and span-level decisions by using the e-SNLI explanations during training. Finally, our model is more robust in a reduced data setting. When training with only 1,000 examples, out-of-distribution performance improves on the MNLI matched and mismatched validation sets by 13% and 16% relative to the baseline. Training with fewer observations yields further improvements, both in-distribution and out-of-distribution.
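As a rough illustration of how span-level predictions might be combined into a sentence-level NLI decision by logical rules, the minimal Python sketch below assumes one common rule set (any contradicted span makes the pair a contradiction; all spans entailed makes it an entailment; otherwise neutral). The function name `aggregate_span_labels` and the exact rules are illustrative assumptions for exposition, not necessarily the rules used in this work.

```python
from enum import Enum


class Label(Enum):
    ENTAILMENT = "entailment"
    NEUTRAL = "neutral"
    CONTRADICTION = "contradiction"


def aggregate_span_labels(span_labels):
    """Combine per-span NLI predictions into a sentence-level label.

    Assumed (illustrative) logical rules:
      - if any hypothesis span is contradicted by the premise -> contradiction
      - if every hypothesis span is entailed by the premise   -> entailment
      - otherwise                                             -> neutral
    """
    if any(label is Label.CONTRADICTION for label in span_labels):
        return Label.CONTRADICTION
    if all(label is Label.ENTAILMENT for label in span_labels):
        return Label.ENTAILMENT
    return Label.NEUTRAL


# Example: one neutral span is enough to block full entailment.
print(aggregate_span_labels([Label.ENTAILMENT, Label.NEUTRAL]))  # Label.NEUTRAL
```

Under rules of this kind, the sentence-level prediction is fully determined by the span-level decisions, which is what makes the responsible hypothesis spans directly identifiable.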