Driven by the need to capture users' evolving interests and optimize their long-term experience, a growing number of recommender systems model recommendation as a Markov decision process and apply reinforcement learning to solve it. Shouldn't research on the fairness of recommender systems follow the same trend, moving from static evaluation and one-shot intervention to dynamic monitoring and continuous control? In this paper, we first review recent developments in recommender systems and then discuss how fairness could be built into reinforcement learning techniques for recommendation. Moreover, we argue that further progress in recommendation fairness may require considering multi-agent (game-theoretic) optimization, multi-objective (Pareto) optimization, and simulation-based optimization within the general framework of stochastic games.