In this paper, we study the problem of deceptive reinforcement learning to preserve the privacy of a reward function. Reinforcement learning is the problem of finding a behaviour policy based on rewards received from exploratory behaviour. A key ingredient in reinforcement learning is a reward function, which determines how much reward (negative or positive) is given and when. In some situations, however, we may want to keep a reward function private; that is, to make it difficult for an observer to determine the reward function being used. We define the problem of privacy-preserving reinforcement learning and present two models for solving it. These models are based on dissimulation -- a form of deception that `hides the truth'. We evaluate our models both computationally and via human behavioural experiments. The results show that the resulting policies are indeed deceptive, and that participants determine the true reward function of a deceptive agent less reliably than that of an honest agent.