As Recommender Systems (RS) influence more and more people in their daily lives, the issue of fairness in recommendation is becoming increasingly important. Most prior approaches to fairness-aware recommendation are situated in a static or one-shot setting, where the protected groups of items are fixed and the model provides a one-time fairness solution based on fairness-constrained optimization. This fails to consider the dynamic nature of recommender systems, where attributes such as item popularity may change over time due to the recommendation policy and user engagement. For example, products that were once popular may lose their popularity, and vice versa. As a result, a system that aims to maintain long-term fairness of item exposure across different popularity groups must accommodate this change in a timely fashion. Novel to this work, we explore the problem of long-term fairness in recommendation and address it through dynamic fairness learning. We focus on the fairness of exposure of items in different groups, where the division into groups is based on item popularity, which changes dynamically during the recommendation process. We tackle this problem by proposing a fairness-constrained reinforcement learning algorithm for recommendation, which models the recommendation problem as a Constrained Markov Decision Process (CMDP), so that the model can dynamically adjust its recommendation policy to ensure that the fairness requirement remains satisfied even as the environment changes. Experiments on several real-world datasets verify our framework's superiority in terms of recommendation performance, short-term fairness, and long-term fairness.
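As a minimal sketch of the constrained formulation referenced above (the notation here is our own illustration and not taken from the abstract), a CMDP augments the standard recommendation MDP with a fairness cost signal and a budget: the policy maximizes expected cumulative recommendation reward subject to an upper bound on expected cumulative fairness cost, e.g., the exposure allocated to the currently popular item group:
\[
\max_{\pi} \; \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{T} \gamma^{t}\, r(s_t, a_t)\right]
\quad \text{s.t.} \quad
\mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{T} \gamma^{t}\, c(s_t, a_t)\right] \le \alpha,
\]
where $r(s_t, a_t)$ is the recommendation reward (e.g., user clicks), $c(s_t, a_t)$ is a fairness cost charged when a recommendation slot goes to the advantaged popularity group, and $\alpha$ is the exposure budget. One common way to handle such a constraint is a Lagrangian relaxation, whose multiplier can be updated online so that the policy keeps satisfying the constraint as the popularity-based group membership of items shifts over time.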