Many large-scale recommender systems consist of two stages, where the first stage focuses on efficiently generating a small subset of promising candidates from a huge pool of items for the second-stage model to curate final recommendations from. In this paper, we investigate how to ensure group fairness to the items in this two-stage paradigm. In particular, we find that existing first-stage recommenders might select an irrecoverably unfair set of candidates such that there is no hope for the second-stage recommender to deliver fair recommendations. To this end, we propose two threshold-policy selection rules that, given any relevance model of queries and items and a point-wise lower confidence bound on the expected number of relevant items for each policy, find near-optimal sets of candidates that contain enough relevant items in expectation from each group of items. To instantiate the rules, we demonstrate how to derive such confidence bounds from potentially partial and biased user feedback data, which are abundant in many large-scale recommender systems. In addition, we provide both finite-sample and asymptotic analysis of how close the two threshold selection rules are to the optimal thresholds. Beyond this theoretical analysis, we show empirically that these two rules can consistently select enough relevant items from each group while minimizing the size of the candidate sets for a wide range of settings.
翻译:许多大型推荐人系统由两个阶段组成,第一阶段的重点是从第二阶段模式的庞大项目库中从大量项目库中高效产生一小组有希望的候选人,从其中产生一小组有希望的候选人,以便从中推敲最后建议。在本文件中,我们调查如何确保这一两阶段范式项目的集体公平性;特别是,我们发现,现有的第一阶段推荐人可能选择一套无法挽回的不公平候选人,以致第二阶段推荐人无法提出公平建议。为此,我们提出了两项门槛政策甄选规则,考虑到查询和项目的任何相关模式,以及对每项政策相关项目的预期数量所约束的低点信任度,找到每组项目期望中包含足够相关项目的近最佳候选人组。为了回调这些规则,我们证明如何从潜在偏差和偏差的用户反馈数据中获取这种信任。许多大型推荐人系统都有大量这些数据。此外,我们提供了两项关于两个门槛选择规则如何接近最佳门槛的有限和简单分析。除了理论分析之外,我们还从每一组的理论分析之外,以足够广泛的实验性的方式选择了两个候选人项目。