Learning by interaction is the key to skill acquisition for most living organisms; this paradigm is formalized as Reinforcement Learning (RL). RL is effective at finding optimal policies that endow complex systems with sophisticated behavior. Model-based RL relies on a model of the system dynamics to find the optimal policy; such a model can be obtained either by formulating a mathematical model or through system identification. Dynamic models, however, are subject to aleatoric and epistemic uncertainties that can make the learned model diverge from the true dynamics and cause the RL algorithm to exhibit erroneous behavior. As a result, the RL process becomes sensitive to operating conditions and to changes in model parameters, and loses its generality. Addressing these problems would otherwise require intensive system identification for every individual system, even when the structure of the dynamics is shared, because a slight deviation in model parameters can render the model useless for RL. An oracle that can adaptively predict the remainder of a trajectory, regardless of these uncertainties, would help resolve the issue. This work presents a framework that facilitates system identification across different instances of the same dynamics class: a probability distribution over the dynamics, conditioned on observed data, is learned with variational inference, and its reliability is demonstrated by robustly solving different instances of control problems with a single model in model-based RL with maximum sample efficiency.
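The core idea of conditioning a distribution over dynamics on observed data can be illustrated with a deliberately small example. The sketch below (illustrative only, not the paper's implementation; the scalar system, prior, and learning rate are all assumptions) runs black-box variational inference over one unknown dynamics parameter `theta` of a scalar linear system `s' = theta*s + a + noise`, fitting a Gaussian `q(theta) = N(mu, sigma^2)` by stochastic gradient ascent on the ELBO with the reparameterization trick:

```python
# Illustrative sketch: infer a distribution over an unknown dynamics
# parameter from observed transitions, via variational inference.
import numpy as np

rng = np.random.default_rng(0)
true_theta, obs_noise = 0.8, 0.5  # ground truth for this instance

# Observed transitions from one instance of the dynamics class.
s = rng.normal(size=50)
a = rng.normal(size=50)
s_next = true_theta * s + a + obs_noise * rng.normal(size=50)

mu, log_sigma = 0.0, 0.0  # variational parameters of q(theta)
lr = 0.002                # prior over theta is standard normal N(0, 1)

for _ in range(5000):
    sigma = np.exp(log_sigma)
    eps = rng.normal()
    theta = mu + sigma * eps  # reparameterized sample theta ~ q
    # d/d(theta) of the Gaussian log-likelihood of the observed transitions.
    dll = np.sum((s_next - theta * s - a) * s) / obs_noise**2
    # ELBO gradients: likelihood term plus closed-form -KL(q || N(0,1)) term.
    mu += lr * (dll - mu)
    log_sigma += lr * (dll * eps * sigma + 1.0 - sigma**2)

print(f"posterior over theta: mu={mu:.2f}, sigma={np.exp(log_sigma):.2f}")
```

The posterior mean concentrates near the instance's true parameter while the posterior variance quantifies the remaining epistemic uncertainty; in the full framework, the same principle applies with an amortized, learned conditional distribution rather than per-instance optimization.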