Computer vision is widely deployed, has highly visible, society-altering applications, and has documented problems with bias and representation. Datasets are critical for benchmarking progress in fair computer vision, and often employ broad racial categories as population groups for measuring group fairness. Similarly, diversity is often measured in computer vision datasets by ascribing and counting categorical race labels. However, racial categories are ill-defined, unstable temporally and geographically, and have a problematic history of scientific use. Although the racial categories used across datasets are superficially similar, the complexity of human race perception suggests the racial system encoded by one dataset may be substantially inconsistent with that of another. Using the insight that a classifier can learn the racial system encoded by a dataset, we conduct an empirical study of computer vision datasets supplying categorical race labels for face images to determine the cross-dataset consistency and generalization of racial categories. We find that each dataset encodes a substantially unique racial system, despite nominally equivalent racial categories, and that some racial categories are systematically less consistent than others across datasets. We find evidence that racial categories encode stereotypes, and exclude ethnic groups from categories on the basis of nonconformity to those stereotypes. Representing a billion humans under one racial category may obscure disparities and create new ones by encoding stereotypes of racial systems. The difficulty of adequately converting the abstract concept of race into a tool for measuring fairness underscores the need for a method more flexible and culturally aware than racial categories.
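The core measurement idea, that a classifier trained on one dataset's race labels can be used to probe how consistently another dataset applies the nominally same categories, can be sketched in a few lines. The following is a minimal toy illustration, not the paper's actual pipeline: it uses synthetic feature vectors and invented dataset-specific labeling rules in place of real face embeddings and annotations, and measures within- versus cross-dataset label agreement.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_dataset(shift):
    """Toy stand-in for a labeled face dataset (hypothetical).

    Each dataset assigns one of the same nominal categories, but the
    labeling rule it encodes -- its 'racial system' -- may differ.
    Here the rule is just a dataset-specific linear threshold.
    """
    X = rng.normal(size=(500, 8))          # fake embeddings
    y = (X[:, 0] + shift * X[:, 1] > 0).astype(int)
    return X, y

X_a, y_a = make_dataset(shift=0.0)   # dataset A's labeling rule
X_b, y_b = make_dataset(shift=1.5)   # dataset B encodes a different rule

# Train a classifier on A's labels, then score it against B's labels.
# High within-dataset but low cross-dataset agreement indicates the two
# datasets encode inconsistent systems despite identical category names.
clf = LogisticRegression().fit(X_a, y_a)
within = accuracy_score(y_a, clf.predict(X_a))
cross = accuracy_score(y_b, clf.predict(X_b))
print(f"within-dataset agreement: {within:.2f}")
print(f"cross-dataset agreement:  {cross:.2f}")
```

In the real study the classifier would be trained on face images with each dataset's annotations; the gap between within- and cross-dataset agreement is the quantity of interest.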