Neural language models (LMs) have achieved impressive results on various language-based reasoning tasks by utilizing latent knowledge encoded in their own pretrained parameters. To make this reasoning process more explicit, recent works retrieve a rationalizing LM's internal knowledge by training or prompting it to generate free-text rationales, which can be used to guide task predictions made by either the same LM or a separate reasoning LM. However, rationalizing LMs require expensive rationale annotation and/or computation, without any assurance that their generated rationales improve LM task performance or faithfully reflect LM decision-making. In this paper, we propose PINTO, an LM pipeline that rationalizes via prompt-based learning, and learns to faithfully reason over rationales via counterfactual regularization. First, PINTO maps out a suitable reasoning process for the task input by prompting a frozen rationalizing LM to generate a free-text rationale. Second, PINTO's reasoning LM is fine-tuned to solve the task using the generated rationale as context, while regularized to output less confident predictions when the rationale is perturbed. Across four datasets, we show that PINTO significantly improves the generalization ability of the reasoning LM, yielding higher performance on both in-distribution and out-of-distribution test sets. Also, we find that PINTO's rationales are more faithful to its task predictions than those generated by competitive baselines.
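To make the counterfactual regularization described above concrete, the sketch below shows one plausible way such an objective could be written, assuming a hypothetical `model(question, rationale)` callable that returns answer-choice logits and a hypothetical `perturb_rationale` corruption function; this is an illustrative assumption, not PINTO's actual implementation. The idea is to combine the standard task loss on the generated rationale with a term that pushes the prediction toward the uniform distribution when the rationale is perturbed, so that confident predictions must depend on the rationale itself.

```python
# Minimal sketch of counterfactual regularization for a rationale-conditioned
# classifier. `model`, `perturb_rationale`, and `lambda_reg` are illustrative
# assumptions, not names from the paper.
import torch
import torch.nn.functional as F

def counterfactual_regularized_loss(model, question, rationale, label,
                                    perturb_rationale, lambda_reg=1.0):
    """Task loss on the real rationale, plus a penalty that makes the model
    less confident when the rationale is corrupted (tokens masked/replaced)."""
    # Standard cross-entropy on the (question, rationale) pair.
    # `label` is a scalar LongTensor indexing the correct answer choice.
    logits = model(question, rationale)              # shape: [num_choices]
    task_loss = F.cross_entropy(logits.unsqueeze(0), label.unsqueeze(0))

    # Counterfactual pass: same question, perturbed rationale.
    logits_cf = model(question, perturb_rationale(rationale))
    log_probs_cf = F.log_softmax(logits_cf, dim=-1)

    # KL(uniform || p_cf): zero exactly when the perturbed-rationale
    # prediction is maximally uncertain (uniform over answer choices).
    uniform = torch.full_like(log_probs_cf, 1.0 / log_probs_cf.numel())
    reg_loss = F.kl_div(log_probs_cf, uniform, reduction="sum")

    return task_loss + lambda_reg * reg_loss
```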