Collaborative filtering algorithms capture underlying consumption patterns, including those specific to particular demographics or protected information of users, e.g., gender, race, and location. These encoded biases can influence the decisions of a recommendation system (RS) toward further separating the content provided to various demographic subgroups, and raise privacy concerns regarding the disclosure of users' protected attributes. In this work, we investigate the possibility and challenges of removing specific protected information of users from the learned interaction representations of an RS algorithm while maintaining its effectiveness. Specifically, we incorporate adversarial training into the state-of-the-art MultVAE architecture, resulting in a novel model, Adversarial Variational Auto-Encoder with Multinomial Likelihood (Adv-MultVAE), which aims to remove the implicit information of protected attributes while preserving recommendation performance. We conduct experiments on the MovieLens-1M and LFM-2b-DemoBias datasets, and evaluate the effectiveness of the bias mitigation method based on the inability of external attackers to reveal users' gender information from the model. Compared with the baseline MultVAE, the results show that Adv-MultVAE, with marginal deterioration in performance (w.r.t. NDCG and recall), largely mitigates inherent biases in the model on both datasets.
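Below is a minimal PyTorch sketch of the adversarial setup described above, assuming the standard gradient-reversal formulation of adversarial training: an adversary head tries to predict the protected attribute from the VAE latent code, while a gradient reversal layer pushes the encoder to remove that information. The class name AdvMultVAE, the single-layer encoder/decoder, and the weights grl_lambda and beta are illustrative assumptions for this sketch, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; flips and scales the gradient on backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class AdvMultVAE(nn.Module):
    """Simplified MultVAE with an adversarial head on the latent code (sketch)."""
    def __init__(self, n_items, latent_dim=200, grl_lambda=1.0):
        super().__init__()
        self.encoder = nn.Linear(n_items, 2 * latent_dim)  # outputs [mu, logvar]
        self.decoder = nn.Linear(latent_dim, n_items)      # multinomial logits
        self.adversary = nn.Linear(latent_dim, 1)          # protected-attribute probe
        self.latent_dim = latent_dim
        self.grl_lambda = grl_lambda

    def forward(self, x):
        h = self.encoder(F.normalize(x))                   # L2-normalized interactions
        mu, logvar = h[:, :self.latent_dim], h[:, self.latent_dim:]
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        logits = self.decoder(z)
        # Gradient reversal: the adversary learns to predict the attribute,
        # while the reversed gradient trains the encoder to hide it.
        adv_logit = self.adversary(GradientReversal.apply(z, self.grl_lambda))
        return logits, mu, logvar, adv_logit

def loss_fn(logits, mu, logvar, adv_logit, x, attr, beta=0.2):
    # Multinomial log-likelihood (MultVAE reconstruction term)
    nll = -(F.log_softmax(logits, dim=1) * x).sum(dim=1).mean()
    # KL divergence to the standard normal prior, weighted by beta
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
    # Binary cross-entropy for the adversary (attr is a float 0/1 label, e.g. gender)
    adv = F.binary_cross_entropy_with_logits(adv_logit.squeeze(1), attr)
    return nll + beta * kl + adv
```

In this sketch, protected-attribute labels are needed only during training to drive the adversary; at inference time, only the encoder and decoder are used to produce recommendations, so the learned representations can be served without access to the attribute.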