Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models

Current Natural Language Inference (NLI) models achieve impressive results, sometimes outperforming humans when evaluating on in-distribution test sets. However, as these models are known to learn from annotation artefacts and dataset biases, it is unclear to what extent the models are learning the task of NLI instead of learning from shallow heuristics in their training data. We address this issue by introducing a logical reasoning framework for NLI, creating highly transparent model decisions that are based on logical rules. Unlike prior work, we show that improved interpretability can be achieved without decreasing the predictive accuracy. We almost fully retain performance on SNLI, while also identifying the exact hypothesis spans that are responsible for each model prediction. Using the e-SNLI human explanations, we verify that our model makes sensible decisions at a span level, despite not using any span labels during training. We can further improve model performance and span-level decisions by using the e-SNLI explanations during training. Finally, our model is more robust in a reduced data setting. When training with only 1,000 examples, out-of-distribution performance improves on the MNLI matched and mismatched validation sets by 13% and 16% relative to the baseline. Training with fewer observations yields further improvements, both in-distribution and out-of-distribution.
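The abstract says that sentence-level decisions follow from logical rules applied to span-level predictions, but it does not spell the rules out. The following is a minimal sketch of one plausible rule set, given here as an assumption rather than the authors' confirmed method: the hypothesis is a contradiction if any span is contradicted, otherwise neutral if any span is neutral, otherwise entailment. The function names `aggregate_span_labels` and `responsible_spans`, and the example spans, are hypothetical.

```python
from typing import List, Sequence

LABELS = ("entailment", "neutral", "contradiction")


def aggregate_span_labels(span_labels: Sequence[str]) -> str:
    """Combine per-span labels into one sentence-level NLI label.

    Assumed rules (not taken from the abstract):
      - any contradicted span      -> contradiction
      - else any neutral span      -> neutral
      - else (all spans entailed)  -> entailment
    """
    if not span_labels:
        raise ValueError("need at least one hypothesis span")
    for label in span_labels:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")
    if "contradiction" in span_labels:
        return "contradiction"
    if "neutral" in span_labels:
        return "neutral"
    return "entailment"


def responsible_spans(spans: Sequence[str], span_labels: Sequence[str]) -> List[str]:
    """Return the hypothesis spans that determine the sentence-level label.

    For entailment every span must be entailed, so all spans are returned;
    otherwise only the spans carrying the deciding label are returned.
    """
    if len(spans) != len(span_labels):
        raise ValueError("spans and span_labels must have the same length")
    sentence_label = aggregate_span_labels(span_labels)
    if sentence_label == "entailment":
        return list(spans)
    return [s for s, l in zip(spans, span_labels) if l == sentence_label]


# Hypothetical example: hypothesis "A man sleeps on a red sofa"
# split into three spans with made-up span-level predictions.
spans = ["A man", "sleeps", "on a red sofa"]
labels = ["entailment", "contradiction", "neutral"]
print(aggregate_span_labels(labels))      # sentence-level label
print(responsible_spans(spans, labels))   # spans behind that label
```

Under these assumed rules, each prediction can be traced to specific hypothesis spans, which is the kind of transparency the abstract describes. A trained model would supply the span labels; here they are hard-coded for illustration.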
