Many large-scale recommender systems consist of two stages, where the first stage focuses on efficiently generating a small subset of promising candidates from a huge pool of items for the second-stage model to curate final recommendations from. In this paper, we investigate how to ensure groups fairness to the items in this two-stage paradigm. In particular, we find that existing first-stage recommenders might select an irrecoverably unfair set of candidates such that there is no hope for the second-stage recommender to deliver fair recommendations. To this end, we propose two threshold-policy selection rules that, given any relevance model of queries and items and a point-wise lower confidence bound on the expected number of relevant items for each policy, find near-optimal sets of candidates that contain enough relevant items in expectation from each group of items. To instantiate the rules, we demonstrate how to derive such confidence bounds from potentially partial and biased user feedback data, which are abundant in many large-scale recommender systems. In addition, we provide both finite-sample and asymptotic analysis of how close the two threshold selection rules are to the optimal thresholds. Beyond this theoretical analysis, we show empirically that these two rules can consistently select enough relevant items from each group while minimizing the size of the candidate sets for a wide range of settings.
翻译:许多大型推荐人系统由两个阶段组成,第一阶段的重点是从第二阶段模式的庞大项目库中从大量项目库中高效产生一小组有希望的候选人,从其中产生一小组有希望的候选人,以便从中推敲最后建议。在本文件中,我们调查如何确保群体公平对待这一两阶段范式中的项目。特别是,我们发现现有第一阶段推荐人可能选择一组无法挽回的不公平候选人,这样第二阶段推荐人就无望提出公平建议。为此,我们提出了两个门槛政策甄选规则,考虑到查询和项目的任何相关模式,以及对每项政策相关项目的预期数量所约束的低点信任度,找到每组项目中包含足够相关项目的近最佳候选人组。要对规则进行回旋,我们证明如何从潜在偏差和偏差的用户反馈数据中获取这种信任的界限,因为许多大型推荐人系统都有大量这些数据。此外,我们提供了两种最低限度的门槛选择规则如何接近最佳临界值。除了理论分析之外,我们还从每个组的每个组中持续地选择了两个最广泛的候选项目的经验范围。