Many methods now exist for conditioning model outputs on task instructions, retrieved documents, and user-provided explanations and feedback. Rather than relying solely on examples of task inputs and outputs, these approaches allow valuable additional data to be used in modeling, with the aim of improving model correctness and aligning learned models with human priors. Meanwhile, a growing body of evidence suggests that some language models can (1) store a large amount of knowledge in their parameters, and (2) perform inference over tasks in unstructured text to solve new tasks at test time. These results raise the possibility that, for some tasks, humans cannot explain to a model any more about the task than it already knows or could infer on its own. In this paper, we study the circumstances under which explanations of individual data points can (or cannot) improve modeling performance. In order to carefully control important properties of the data and explanations, we introduce a synthetic dataset for our experiments, and we also make use of three existing datasets with explanations: e-SNLI, TACRED, and SemEval. We first give a formal framework for the available modeling approaches, in which explanation data can be used as model inputs, as labels, or as a prior. After arguing that the most promising role for explanation data is as model inputs, we propose a retrieval-based method and show that it solves our synthetic task with accuracies upwards of 95%, while baselines without explanation data achieve below 65% accuracy. We then identify properties of datasets for which retrieval-based modeling fails. With the three existing datasets, we find no improvements from explanation retrieval. Drawing on findings from our synthetic task, we suggest that at least one of six preconditions for successful modeling fails to hold with these datasets.
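To make the retrieval-based use of explanation data as model inputs concrete, the sketch below shows one minimal way such a pipeline could look. This is an illustrative toy, not the paper's implementation: the encoder is a hashing stand-in for a learned sentence embedder, and all names, the `[SEP]` formatting, and the toy training triples are assumptions for the example.

```python
"""Illustrative sketch of explanation retrieval: for a new input, retrieve
the human explanation attached to the most similar training example and
concatenate it to the model input before classification."""
import numpy as np


def embed(text: str, dim: int = 64) -> np.ndarray:
    # Stand-in encoder: hash tokens into a bag-of-words-style vector.
    # A real system would use a learned sentence encoder instead.
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[hash(tok) % dim] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-8)


# Toy (input, label, explanation) triples; the format is assumed here.
train_data = [
    ("a man plays guitar", "entailment",
     "playing an instrument is making music"),
    ("a dog sleeps indoors", "contradiction",
     "sleeping indoors rules out running outside"),
]


def retrieve_explanation(query: str) -> str:
    # Nearest-neighbor retrieval by cosine similarity of unit embeddings.
    q = embed(query)
    sims = [q @ embed(x) for x, _, _ in train_data]
    _, _, explanation = train_data[int(np.argmax(sims))]
    return explanation


def build_model_input(query: str) -> str:
    # The retrieved explanation is used as an additional *input*,
    # concatenated to the query for a downstream classifier.
    return f"{query} [SEP] {retrieve_explanation(query)}"


print(build_model_input("a woman plays violin"))
```

Whether conditioning on retrieved explanations in this way actually helps depends on the preconditions discussed above, for instance whether the explanations carry task information the model does not already know or cannot infer from the input alone.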