In this paper we motivate the causal mechanisms behind sample selection induced collider bias (selection collider bias) that can cause Large Language Models (LLMs) to learn unconditional dependence between entities that are unconditionally independent in the real world. We show that selection collider bias can become amplified in underspecified learning tasks, and although it is difficult to overcome, we describe a method to exploit the resulting spurious correlations to determine when a model may be uncertain about its prediction. We demonstrate an uncertainty metric that matches human uncertainty on tasks with gender pronoun underspecification, using an extended version of the Winogender Schemas evaluation set, and we provide online demos where users can evaluate spurious correlations and apply our uncertainty metric to their own texts and models. Finally, we generalize our approach to address a wider range of prediction tasks.
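A minimal sketch of one way such a pronoun-uncertainty score could be computed with a masked language model. The model name (bert-base-uncased), the candidate pronoun set, the example sentence, and the entropy-based normalization are all illustrative assumptions here, not the paper's exact metric:

```python
import math
from transformers import pipeline

# Illustrative sketch only: the paper's actual metric may differ.
# Score candidate pronouns for a masked slot with a masked LM.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

sentence = "The doctor told the patient that [MASK] would call with the results."
pronouns = ["he", "she", "they"]  # assumed candidate set

# `targets` restricts scoring to the listed tokens.
preds = fill_mask(sentence, targets=pronouns)
probs = [p["score"] for p in preds]

# Renormalize over the pronoun set, then use normalized entropy
# in [0, 1] as an uncertainty score: 1.0 = maximally uncertain.
total = sum(probs)
probs = [p / total for p in probs]
uncertainty = -sum(p * math.log(p) for p in probs) / math.log(len(probs))

for pred, q in zip(preds, probs):
    print(f"{pred['token_str']:>5}: {q:.3f}")
print(f"uncertainty: {uncertainty:.3f}")
```

Under this reading, an underspecified sentence (where a human would also be unsure of the referent's pronoun) should yield a score near 1.0, while a score near 0.0 on such a sentence would indicate the model is relying on a spurious correlation.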