In this paper, we cast the problem of task underspecification in causal terms and develop a method for empirically measuring spurious associations between gender and gender-neutral entities in unmodified large language models, detecting previously unreported spurious correlations. We then describe a lightweight method that exploits the resulting spurious associations to classify prediction-task uncertainty, achieving over 90% accuracy on a Winogender Schemas challenge set. Finally, we generalize our approach to a wider range of prediction tasks and provide open-source demos for each method described here.