Model-based reinforcement learning and control have demonstrated great potential in a variety of sequential decision-making domains, including robotics. However, real-world robotic systems often present challenges that limit the applicability of these methods. In particular, we note two problems that occur jointly in many industrial systems: 1) irregular or asynchronous observations and actions, and 2) dramatic changes in environment dynamics from one episode to another (e.g., varying payload inertial properties). We propose a general framework that overcomes these difficulties by meta-learning adaptive dynamics models for continuous-time prediction and control. We evaluate the proposed approach on a simulated industrial robot. Evaluations on real robotic systems will be added in future iterations of this pre-print.