How do we design measures of social bias that we trust? While prior work has introduced several measures, no measure has gained widespread trust: instead, mounting evidence argues we should distrust these measures. In this work, we design bias measures that warrant trust based on the cross-disciplinary theory of measurement modeling. To combat the frequently fuzzy treatment of social bias in NLP, we explicitly define social bias, grounded in principles drawn from social science research. We operationalize our definition by proposing a general bias measurement framework, DivDist, which we use to instantiate 5 concrete bias measures. To validate our measures, we propose a rigorous testing protocol with 8 testing criteria (e.g., predictive validity: do measures predict biases in US employment?). Through our testing, we present considerable evidence that our measures warrant trust, showing they overcome conceptual, technical, and empirical deficiencies present in prior measures.