Social bias in Pretrained Language Models (PLMs) affects text generation and other downstream NLP tasks. Existing bias testing methods rely predominantly on manual templates or on expensive crowd-sourced data. We propose AutoBiasTest, a novel method that automatically generates sentences for testing bias in PLMs, providing a flexible and low-cost alternative. Our approach uses another PLM for generation and controls the generated sentences by conditioning on social group and attribute terms. We show that the generated sentences are natural and similar to human-produced content in terms of word length and diversity, and that larger generator models yield social bias estimates with lower variance. Our bias scores correlate well with those from manual templates, yet AutoBiasTest surfaces biases these templates miss because its test sentences are more diverse and realistic. By automating large-scale test sentence generation, we enable better estimation of the underlying bias distribution.
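To make the conditioning step concrete, below is a minimal sketch of generating a test sentence with a HuggingFace-style causal language model, conditioned on a social group term and an attribute term. The `gpt2` checkpoint, prompt template, and sampling parameters are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of conditioned test-sentence generation.
# Assumptions: HuggingFace transformers is installed; the gpt2 checkpoint
# and the prompt template are placeholders, not AutoBiasTest's actual setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def generate_test_sentence(group_term: str, attribute_term: str) -> str:
    # Condition generation on both terms by embedding them in the prompt
    # (hypothetical prompt format for illustration).
    prompt = f"Write a sentence about {group_term} and {attribute_term}:"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=30,
        do_sample=True,   # sampling yields diverse candidate test sentences
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Strip the prompt so only the generated test sentence remains.
    return text[len(prompt):].strip()

print(generate_test_sentence("grandmothers", "caring"))
```

In practice, repeating this sampling over many group/attribute pairs is what enables the large-scale, automated estimation of bias distributions described above.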