Pairwise comparisons based on human judgements are an effective method for determining rankings of items or individuals. However, as human biases perpetuate from pairwise comparisons to recovered rankings, they affect algorithmic decision making. In this paper, we introduce the problem of fairness-aware ranking recovery from pairwise comparisons. We propose a group-conditioned accuracy measure which quantifies fairness of rankings recovered from pairwise comparisons. We evaluate the impact of state-of-the-art ranking recovery algorithms and sampling approaches on accuracy and fairness of the recovered rankings, using synthetic and empirical data. Our results show that Fairness-Aware PageRank and GNNRank with FA*IR post-processing effectively mitigate existing biases in pairwise comparisons and improve the overall accuracy of recovered rankings. We highlight limitations and strengths of different approaches, and provide a Python package to facilitate replication and future work on fair ranking recovery from pairwise comparisons.
翻译:暂无翻译