Information access research (and development) sometimes makes use of gender, whether to report on the demographics of participants in a user study, to serve as an input to personalized results or recommendations, or to make systems gender-fair, among other purposes. Such work, however, makes a variety of assumptions about gender that are not necessarily aligned with current understandings of what gender is, how it should be encoded, and how a gender variable should be ethically used. In this work, we present a systematic review of papers on information retrieval and recommender systems that mention gender in order to document how gender is currently being used in this field. We find that most papers mentioning gender do not use an explicit gender variable, but most of those that do either focus on contextualizing results of model performance, personalizing a system based on assumptions of user gender, or auditing a model's behavior for fairness or other privacy-related issues. Moreover, most of the papers we review rely on a binary notion of gender, even if they acknowledge that gender cannot be split into two categories. We connect these findings with scholarship on gender theory and recent work on gender in human-computer interaction and natural language processing. We conclude by making recommendations for ethical and well-grounded use of gender in building and researching information access systems.