In this paper, we describe the causal mechanisms behind sample-selection-induced collider bias (selection collider bias), which can cause Large Language Models (LLMs) to learn unconditional dependence between entities that are unconditionally independent in the real world. We show that selection collider bias can be amplified in underspecified learning tasks, and that the magnitude of the resulting spurious correlations appears to be scale-agnostic. Although selection collider bias can be difficult to overcome, we present a method that exploits the resulting spurious correlations to determine when a model may be uncertain about its prediction, and we demonstrate that this method matches human uncertainty in tasks with gender pronoun underspecification on an extended version of the Winogender Schemas evaluation set.
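The mechanism named in the abstract can be illustrated with a small simulation. The sketch below is not the paper's experimental setup; it is a minimal toy example, assuming two independent standard-normal variables and a hypothetical selection rule that keeps a sample only when their sum exceeds a threshold. The function names, the threshold, and the sample size are all illustrative assumptions. It shows that variables that are independent in the full population become correlated once we condition on a selection variable that both of them influence (a collider).

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_selection_collider(n=20000, threshold=1.0, seed=0):
    """Return (correlation in full population, correlation in selected sample).

    X and Y are drawn independently. The selection rule (hypothetical,
    chosen only for illustration) keeps a sample when X + Y > threshold,
    so selection is a common effect (collider) of X and Y.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Condition on the collider: only selected samples enter the "dataset".
    kept = [(x, y) for x, y in zip(xs, ys) if x + y > threshold]
    sel_x = [x for x, _ in kept]
    sel_y = [y for _, y in kept]

    return pearson(xs, ys), pearson(sel_x, sel_y)


if __name__ == "__main__":
    r_full, r_selected = simulate_selection_collider()
    print(f"full population correlation: {r_full:.3f}")
    print(f"selected sample correlation: {r_selected:.3f}")
```

In this toy setting the full-population correlation stays near zero, while the selected sample shows a clearly negative correlation. A model trained only on the selected sample would see that dependence as if it were unconditional, which is the sense in which selection can induce spurious correlations.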