The content that a recommender system (RS) shows to users influences them. Therefore, when choosing which recommender to deploy, one is implicitly also choosing to induce specific internal states in users. Moreover, systems trained via long-horizon optimization will have direct incentives to manipulate users, e.g., to shift their preferences so they are easier to satisfy. In this work we focus on induced preference shifts in users. We argue that, before deployment, system designers should: estimate the shifts a recommender would induce; evaluate whether such shifts would be undesirable; and even actively optimize to avoid problematic shifts. These steps involve two challenging ingredients: estimation requires anticipating how hypothetical policies would influence user preferences if deployed, which we do by using historical user interaction data to train a predictive user model that implicitly captures their preference dynamics; evaluation and optimization additionally require metrics to assess whether such influences are manipulative or otherwise unwanted, for which we use the notion of "safe shifts", which define a trust region within which behavior is considered safe. In simulated experiments, we show that our learned preference dynamics model is effective in estimating user preferences and how they would respond to new recommenders. Additionally, we show that recommenders that optimize for staying within the trust region can avoid manipulative behaviors while still generating engagement.
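To make the pipeline concrete, the following is a minimal, illustrative sketch (not the paper's implementation): it fits a simple linear model of preference dynamics from logged interactions, uses it to estimate the shifts a candidate recommender would induce, and penalizes predicted shifts that leave a trust region defined by a baseline policy. All policy names, the simulated dynamics, and the penalty form are assumptions for illustration only.

```python
# Illustrative sketch: learned preference dynamics + trust-region shift penalty.
# The simulator, policies, and penalty are hypothetical, not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)
D = 4   # dimensionality of user preference / item feature vectors
T = 20  # horizon of an interaction episode

def true_dynamics(pref, item):
    # Ground truth (unknown to the designer): preferences drift toward shown items.
    return 0.9 * pref + 0.1 * item

# --- 1. Fit a preference-dynamics model from historical interaction data ---
# Logged tuples (pref_t, item_t, pref_{t+1}); simulated here for illustration.
prefs = rng.normal(size=(5000, D))
items = rng.normal(size=(5000, D))
next_prefs = np.array([true_dynamics(p, a) for p, a in zip(prefs, items)])

X = np.hstack([prefs, items])                       # regress next pref on (pref, item)
W, *_ = np.linalg.lstsq(X, next_prefs, rcond=None)  # learned dynamics: pref' ~ [pref, item] @ W

def predicted_dynamics(pref, item):
    return np.hstack([pref, item]) @ W

# --- 2. Estimate induced shifts and score candidate recommenders -----------
def rollout(policy, pref0, dynamics):
    """Roll out a recommender policy under a preference-dynamics model."""
    pref, engagement, trajectory = pref0.copy(), 0.0, [pref0.copy()]
    for _ in range(T):
        item = policy(pref)
        engagement += float(pref @ item)            # simple engagement proxy
        pref = dynamics(pref, item)
        trajectory.append(pref.copy())
    return engagement, np.array(trajectory)

def myopic_policy(pref):
    # Recommends what the user likes right now.
    return pref / (np.linalg.norm(pref) + 1e-8)

def manipulative_policy(pref):
    # Pushes preferences toward a fixed target that is easy to satisfy.
    target = np.ones(D) / np.sqrt(D)
    return 0.5 * myopic_policy(pref) + 0.5 * target

pref0 = rng.normal(size=D)
# "Natural" shifts under a baseline policy define the trust region.
_, baseline_traj = rollout(myopic_policy, pref0, predicted_dynamics)

def trust_region_penalty(traj, baseline, radius=0.5):
    # Penalize predicted preference shifts that stray from the baseline trajectory.
    dists = np.linalg.norm(traj - baseline, axis=1)
    return float(np.maximum(dists - radius, 0.0).sum())

for name, policy in [("myopic", myopic_policy), ("manipulative", manipulative_policy)]:
    eng, traj = rollout(policy, pref0, predicted_dynamics)
    pen = trust_region_penalty(traj, baseline_traj)
    print(f"{name:>12}: estimated engagement={eng:6.2f}, shift penalty={pen:5.2f}")
```

In this toy setting, a penalized objective such as engagement minus the shift penalty would let a designer compare candidate recommenders before deployment, preferring ones whose predicted preference trajectories stay inside the trust region.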