Free-form rationales aim to aid model interpretability by supplying the background knowledge needed to understand model decisions. Crowdsourced rationales are available for commonsense QA instances in popular datasets such as CoS-E and ECQA, but their utility remains under-investigated. We present human studies showing that ECQA rationales do provide additional background information for understanding a decision, while over 88% of CoS-E rationales do not. Inspired by this finding, we ask: can the additional context provided by free-form rationales benefit models as it benefits human users? We investigate the utility of rationales as an additional source of supervision by varying the quantity and quality of rationales during training. After controlling for instances where rationales leak the correct answer without providing additional background knowledge, we find that incorporating only 5% of rationales during training can boost model performance at inference by 47.22% for CoS-E and 57.14% for ECQA. We also show that rationale quality matters: compared to crowdsourced rationales, T5-generated rationales not only provide weaker supervision to models but are also unhelpful to humans in aiding model interpretability.