In this paper, we provide a moral analysis of two criteria of statistical fairness debated in the machine learning literature: 1) calibration between groups and 2) equality of false positive and false negative rates between groups. We focus on moral arguments in support of either measure. The conflict between group calibration and equality of false positive/false negative rates is one of the core issues in the debate about group fairness definitions among practitioners. For any thorough moral analysis, the meaning of the term fairness has to be made explicit and properly defined. In this paper, we equate fairness with (non-)discrimination, which is a legitimate understanding in the discussion about group fairness. More specifically, we equate it with prima facie wrongful discrimination in the sense in which this notion is used in Lippert-Rasmussen's treatment of the concept. We argue that a violation of group calibration may be unfair in some cases, but not in others. This is in line with claims already advanced in the literature that algorithmic fairness should be defined in a context-sensitive way. The most important practical implication is that arguments based on examples in which fairness requires between-group calibration, or equality of false positive/false negative rates, do not generalize. For it may be that group calibration is a fairness requirement in one case, but not in another.
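To make the two criteria concrete, the following minimal sketch computes a per-group calibration curve and per-group false positive/false negative rates from predicted scores. It is an illustration of the standard statistical definitions only, not an implementation from the paper; the function name `group_metrics`, the decision threshold of 0.5, and the binning scheme are illustrative assumptions. The well-known tension the paper addresses is that, when base rates differ between groups, a calibrated score generally cannot equalize both error rates at once.

```python
import numpy as np

def group_metrics(scores, labels, groups, threshold=0.5, n_bins=10):
    """Per-group calibration curve and false positive/negative rates.

    scores: predicted probabilities in [0, 1]
    labels: true binary outcomes (0 or 1)
    groups: group membership label for each instance
    """
    results = {}
    preds = (scores >= threshold).astype(int)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    for g in np.unique(groups):
        m = groups == g
        s, y, p = scores[m], labels[m], preds[m]
        # Group calibration: within each score bin, the observed
        # rate of positive outcomes should match the mean predicted
        # score, for every group.
        idx = np.clip(np.digitize(s, bins) - 1, 0, n_bins - 1)
        calib = [(s[idx == b].mean(), y[idx == b].mean())
                 for b in range(n_bins) if np.any(idx == b)]
        # Error-rate equality: false positive and false negative
        # rates should be equal across groups.
        fpr = np.mean(p[y == 0]) if np.any(y == 0) else np.nan
        fnr = np.mean(1 - p[y == 1]) if np.any(y == 1) else np.nan
        results[g] = {"calibration": calib, "fpr": fpr, "fnr": fnr}
    return results
```

Comparing the `calibration` pairs and the `fpr`/`fnr` values across groups shows, on any given dataset, which of the two criteria a classifier satisfies and which it violates.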