Correctly resolving textual mentions of people fundamentally entails making inferences about those people. Such inferences raise the risk of systemic biases in coreference resolution systems, including biases that can harm binary and non-binary trans and cis stakeholders. To better understand such biases, we foreground nuanced conceptualizations of gender from sociology and sociolinguistics, and develop two new datasets for interrogating bias in crowd annotations and in existing coreference resolution systems. Through these studies, conducted on English text, we confirm that without acknowledging and building systems that recognize the complexity of gender, we build systems that lead to many potential harms.