Dream to Explore: Adaptive Simulations for Autonomous Systems

One's ability to learn a generative model of the world without supervision depends on the extent to which one can construct abstract knowledge representations that generalize across experiences. To this end, capturing accurate statistical structure from observational data provides useful inductive biases that can be transferred to novel environments. Here, we tackle the problem of learning to control dynamical systems by applying Bayesian nonparametric methods, and we use this approach to solve visual servoing tasks. This is accomplished by first learning a state space representation, then inferring environmental dynamics and improving the policies through imagined future trajectories. Bayesian nonparametric models provide automatic model adaptation, which not only combats underfitting and overfitting but also allows the model's unbounded dimension to be both flexible and computationally tractable. By employing Gaussian processes to discover latent world dynamics, we mitigate common data efficiency issues observed in reinforcement learning and avoid introducing explicit model bias by describing the system's dynamics. Our algorithm jointly learns a world model and a policy by optimizing a variational lower bound of a log-likelihood with respect to the expected free energy minimization objective function. Finally, we compare the performance of our model with state-of-the-art alternatives for continuous control tasks in simulated environments.
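As a rough illustration of the Gaussian process idea mentioned above, the following minimal sketch fits a GP to one-step transitions of a toy system and queries its predictive mean and variance. It is not the authors' algorithm: the toy dynamics function, the fixed RBF kernel hyperparameters, and all function names (`rbf_kernel`, `gp_fit`, `gp_predict`, `toy_dynamics`) are assumptions made for this example, and the GP here operates on raw states rather than a learned latent space.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_fit(inputs, targets, noise_var=1e-4):
    """Precompute the Cholesky factor and weights for GP regression."""
    gram = rbf_kernel(inputs, inputs) + noise_var * np.eye(len(inputs))
    chol = np.linalg.cholesky(gram)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, targets))
    return chol, alpha

def gp_predict(inputs, chol, alpha, queries):
    """Posterior mean and variance of the latent function at the query points."""
    k_star = rbf_kernel(queries, inputs)
    mean = k_star @ alpha
    v = np.linalg.solve(chol, k_star.T)
    var = rbf_kernel(queries, queries).diagonal() - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

def toy_dynamics(state, action):
    """Hypothetical one-step dynamics, used only to generate training data."""
    return 0.9 * state + 0.5 * np.sin(action)

# Collect (state, action) -> next_state transitions from the toy system.
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, 40)
actions = rng.uniform(-1.0, 1.0, 40)
train_inputs = np.column_stack([states, actions])
train_targets = toy_dynamics(states, actions)

chol, alpha = gp_fit(train_inputs, train_targets)

# Query one point inside the training region and one far outside it.
near = np.array([[0.2, -0.3]])
far = np.array([[5.0, 5.0]])
mean_near, var_near = gp_predict(train_inputs, chol, alpha, near)
mean_far, var_far = gp_predict(train_inputs, chol, alpha, far)
```

The predictive variance grows away from the observed transitions, which is the property that lets a planner distinguish well-modeled regions from unexplored ones when rolling out imagined trajectories.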

