Improving model generalization on held-out data is a core objective in commonsense reasoning. Recent work has shown that models trained on datasets containing superficial cues perform well on easy test instances that retain those cues but poorly on hard instances that lack them. Previous approaches have relied on manual methods to discourage models from overfitting to superficial cues; while some of these methods improve performance on hard instances, they also degrade performance on easy ones. Here, we propose to explicitly learn a model that performs well on both the easy test set (with superficial cues) and the hard test set (without them). Using a meta-learning objective, we learn such a model, improving performance on both test sets. Evaluating on Choice of Plausible Alternatives (COPA) and Commonsense Explanation, we show that our proposed method improves performance on both the easy and the hard test sets, with gains of up to 16.5 percentage points over the baseline.
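The abstract does not specify the exact form of the meta-learning objective, so the sketch below should be read as one plausible MAML-style instantiation rather than the authors' method: the model is adapted on a batch from the easy split in an inner loop, and the outer objective requires the adapted parameters to also fit the hard split. All names here (TinyClassifier, meta_step, inner_lr, the random stand-in data) are illustrative assumptions.

```python
# Hypothetical MAML-style sketch of a meta-learning objective that optimizes
# for performance on both an easy split (with superficial cues) and a hard
# split (without them). Not the paper's implementation; an illustrative guess.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyClassifier(nn.Module):
    """Stand-in for a pretrained answer-scoring model."""
    def __init__(self, dim=32, n_classes=2):
        super().__init__()
        self.linear = nn.Linear(dim, n_classes)

    def forward(self, x, weight=None, bias=None):
        # Allow a functional forward pass with adapted parameters.
        w = weight if weight is not None else self.linear.weight
        b = bias if bias is not None else self.linear.bias
        return F.linear(x, w, b)

def meta_step(model, optimizer, easy_batch, hard_batch, inner_lr=0.1):
    """One meta-update: adapt on the easy split in the inner loop, then
    require the adapted parameters to also fit the hard split."""
    (xe, ye), (xh, yh) = easy_batch, hard_batch

    # Inner loop: one gradient step on the easy (cue-bearing) examples.
    # create_graph=True keeps the graph so the outer loss can backprop
    # through the adaptation step.
    inner_loss = F.cross_entropy(model(xe), ye)
    grads = torch.autograd.grad(inner_loss, list(model.linear.parameters()),
                                create_graph=True)
    w = model.linear.weight - inner_lr * grads[0]
    b = model.linear.bias - inner_lr * grads[1]

    # Outer loop: the adapted model must do well on BOTH splits, so the
    # meta-objective penalizes solutions that trade one set for the other.
    outer_loss = (F.cross_entropy(model(xe, w, b), ye)
                  + F.cross_entropy(model(xh, w, b), yh))
    optimizer.zero_grad()
    outer_loss.backward()
    optimizer.step()
    return outer_loss.item()

# Usage with random stand-in data (8 examples per split, 32-dim features).
torch.manual_seed(0)
model = TinyClassifier()
opt = torch.optim.SGD(model.parameters(), lr=0.01)
easy = (torch.randn(8, 32), torch.randint(0, 2, (8,)))
hard = (torch.randn(8, 32), torch.randint(0, 2, (8,)))
print(meta_step(model, opt, easy, hard))
```

Summing the two outer losses is the simplest way to encode "do well on both test distributions"; a weighted combination or an alternating easy/hard inner loop would be equally consistent with the abstract.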