A self-learning adaptive system (SLAS) uses machine learning to enable and enhance its adaptability. Such systems are expected to perform well in dynamic situations. To learn a high-performance adaptation policy when information about the real situation is incomplete, some assumptions must be made about the environment-system dynamics. However, these assumptions cannot be expected to always be correct, and it is difficult to enumerate all possible assumptions. This leads to the problem of incomplete-information learning. We treat this as a multiple-model problem: finding an adaptation policy that can cope with multiple models of the environment-system dynamics. This paper proposes a novel approach to engineering the online adaptation of SLAS. It separates three concerns related to the adaptation policy and presents a modeling and synthesis process, with the goal of achieving higher model construction efficiency. In addition, it designs a meta-reinforcement learning algorithm that learns a meta policy over the multiple models, so that the meta policy can quickly adapt to the real environment-system dynamics. Finally, it reports a case study on a robotic system to evaluate the adaptability of the approach.