Reinforcement learning (RL) is one of the most important branches of AI. Owing to its capacity for self-adaptation and decision-making in dynamic environments, it has been widely applied in areas such as healthcare, data markets, autonomous driving, and robotics. However, some of these applications and systems have been shown to be vulnerable to security or privacy attacks, resulting in unreliable or unstable services. Many studies have addressed these security and privacy problems in reinforcement learning, but few surveys have systematically reviewed and compared existing problems and state-of-the-art solutions in a way that keeps pace with emerging threats. Accordingly, we present a comprehensive review that explains and summarizes the security and privacy challenges in reinforcement learning from a new perspective, namely that of the Markov Decision Process (MDP). In this survey, we first introduce the key concepts related to this area. Next, we cover the security and privacy issues linked to the state, action, environment, and reward function of the MDP, respectively. We further highlight the special characteristics of security and privacy methodologies related to reinforcement learning. Finally, we discuss possible future research directions in this area.