Safe learning and optimization deals with learning and optimization problems in which the evaluation of unsafe input points must be avoided as much as possible. Unsafe input points are solutions, policies, or strategies that cause an irrecoverable loss (e.g., breakage of a machine or equipment, or a threat to life). Although a comprehensive survey of safe reinforcement learning algorithms was published in 2015, a number of new algorithms have been proposed since then, and related work in active learning and in optimization was not considered. This paper reviews those algorithms from a number of domains, including reinforcement learning, Gaussian process regression and classification, evolutionary algorithms, and active learning. We provide the fundamental concepts on which the reviewed algorithms are based and a characterization of the individual algorithms. We conclude by explaining how the algorithms are connected and by offering suggestions for future research.