"The Dentist is an involved parent, the bartender is not": Revealing Implicit Biases in QA with Implicit BBQ

Existing benchmarks for evaluating bias in large language models (LLMs) rely primarily on explicit cues, naming protected attributes such as religion, race, and gender outright. In real-world interactions, however, bias is often triggered implicitly, with protected attributes inferred from names, cultural cues, or traits. This oversight leaves a significant blind spot in fairness evaluation. We introduce ImplicitBBQ, a benchmark that extends the Bias Benchmark for QA (BBQ) with implicitly cued protected attributes across 6 categories. Our evaluation of GPT-4o on ImplicitBBQ reveals a troubling performance gap relative to explicit BBQ prompts: accuracy declines by up to 7% in the "sexual orientation" subcategory, with consistent declines observed across most other categories. This indicates that current LLMs contain implicit biases that explicit benchmarks do not detect. ImplicitBBQ offers a crucial tool for nuanced fairness evaluation in NLP.
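
The explicit-versus-implicit comparison described above comes down to a per-category accuracy difference. The following is a minimal sketch of that computation, not the authors' evaluation code: the function names, the `(category, is_correct)` record format, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def accuracy_by_category(records):
    """Return accuracy per category from (category, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for category, is_correct in records:
        totals[category] += 1
        correct[category] += int(is_correct)
    return {c: correct[c] / totals[c] for c in totals}


def accuracy_drop(explicit_records, implicit_records):
    """Explicit-minus-implicit accuracy for categories present in both sets."""
    explicit = accuracy_by_category(explicit_records)
    implicit = accuracy_by_category(implicit_records)
    return {c: explicit[c] - implicit[c] for c in explicit if c in implicit}


# Toy, made-up records (NOT results from the paper).
explicit = [("religion", True), ("religion", True),
            ("gender", True), ("gender", False)]
implicit = [("religion", True), ("religion", False),
            ("gender", True), ("gender", False)]

drops = accuracy_drop(explicit, implicit)
for category, drop in drops.items():
    print(f"{category}: accuracy drop = {drop:.2f}")
```

A positive value means the model answered less accurately when the protected attribute was cued implicitly than when it was stated explicitly.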
