双层建议系统公平性不确定性量化</s> (Uncertainty Quantification for Fairness in Two-Stage Recommender Systems)

Many large-scale recommender systems consist of two stages. The first stage efficiently screens the complete pool of items for a small subset of promising candidates, from which the second-stage model curates the final recommendations. In this paper, we investigate how to ensure group fairness to the items in this two-stage architecture. In particular, we find that existing first-stage recommenders might select an irrecoverably unfair set of candidates such that there is no hope for the second-stage recommender to deliver fair recommendations. To this end, motivated by recent advances in uncertainty quantification, we propose two threshold-policy selection rules that can provide distribution-free and finite-sample guarantees on fairness in first-stage recommenders. More concretely, given any relevance model of queries and items and a point-wise lower confidence bound on the expected number of relevant items for each threshold-policy, the two rules find near-optimal sets of candidates that contain enough relevant items in expectation from each group of items. To instantiate the rules, we demonstrate how to derive such confidence bounds from potentially partial and biased user feedback data, which are abundant in many large-scale recommender systems. In addition, we provide both finite-sample and asymptotic analyses of how close the two threshold selection rules are to the optimal thresholds. Beyond this theoretical analysis, we show empirically that these two rules can consistently select enough relevant items from each group while minimizing the size of the candidate sets for a wide range of settings.

翻译：许多大型推荐人系统由两个阶段组成。第一阶段有效筛选了少数有希望的候选人的全部项目库,第二阶段模式从中确定了最后建议。在本文件中,我们调查了如何确保两阶段架构项目的集体公平性,特别是,我们发现现有第一阶段建议人可能选择一组不可挽回的不公平候选人,这样第二阶段建议人就没有机会提出公平的建议。为此,在不确定性量化方面最近的进展推动下,我们提出了两条门槛政策选择规则,可以在第一阶段建议人中就公平问题提供无分配和无限制的保证。更具体地说,鉴于询问和项目的任何相关模式,以及对每个门槛政策的预期相关项目数目的信任度低一点。我们发现,现有第一阶段建议人可能挑选的一组候选人几乎是最佳的,因此,第二阶段建议人无法指望每个项目有足够的相关项目。为了缓解规则,我们证明如何从潜在偏差和偏差的用户反馈数据中得出这种信任的界限,这些数据在许多大规模建议性建议性建议性建议性建议者中非常丰富。更具体地说,考虑到每个门槛值的每个标准范围,我们从这些集团规则的有限性分析到最接近于最接近的理论性分析。我们又从两个规则都提供了最接近于最接近于最接近的标准项目。</s>