Researchers have devised numerous ways to quantify the social biases embedded in pretrained language models. Because some language models can generate coherent completions given a set of textual prompts, several prompting datasets have been proposed to measure biases between social groups -- posing language generation as a way of identifying biases. In this opinion paper, we analyze how specific choices of prompt sets, metrics, automatic tools, and sampling strategies affect bias results. We find that the practice of measuring biases through text completion is prone to yielding contradictory results under different experimental settings. We additionally provide recommendations for reporting biases in open-ended language generation, for a more complete picture of the biases exhibited by a given language model. Code to reproduce the results is released at https://github.com/feyzaakyurek/bias-textgen.