Dynamic state representation learning is an important task in robot learning. A latent space that captures dynamics-related information has wide applications, such as accelerating model-free reinforcement learning, closing the simulation-to-reality gap, and reducing motion planning complexity. However, current dynamic state representation learning methods scale poorly to complex dynamical systems such as deformable objects, and cannot directly embed well-defined simulation functions into the training pipeline. We propose DiffSRL, a dynamic state representation learning pipeline that uses differentiable simulation to embed complex dynamics models in end-to-end training. We also integrate differentiable dynamical constraints into the pipeline, which give the latent state an incentive to be aware of those constraints. We further establish a state representation learning benchmark on a soft-body simulation system, PlasticineLab, on which our model demonstrates superior performance in capturing long-term dynamics and in reward prediction.
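To make the idea of training through a simulator concrete, the following is a minimal toy sketch and not the DiffSRL implementation. Everything in it is an assumption made for illustration: the `Dual` forward-mode autodiff class, the one-dimensional point-mass `sim_step`, the scalar decoder weight `w`, the sensor scale of 2.0, and the floor constraint `pos >= 0` are all hypothetical. It only shows the general pattern: decode a latent into a simulator state, roll it forward with a differentiable simulator, and backpropagate a multi-step trajectory loss plus a constraint penalty to the decoder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Forward-mode autodiff number: a value and its derivative w.r.t. one parameter."""
    val: float
    grad: float = 0.0

    def __add__(self, o):
        o = _lift(o)
        return Dual(self.val + o.val, self.grad + o.grad)
    __radd__ = __add__

    def __sub__(self, o):
        o = _lift(o)
        return Dual(self.val - o.val, self.grad - o.grad)

    def __rsub__(self, o):
        return _lift(o) - self

    def __mul__(self, o):
        o = _lift(o)
        # Product rule.
        return Dual(self.val * o.val, self.val * o.grad + self.grad * o.val)
    __rmul__ = __mul__


def _lift(x):
    """Wrap plain numbers as constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def relu(x):
    x = _lift(x)
    return x if x.val > 0 else Dual(0.0, 0.0)


def sim_step(pos, vel, action, dt=0.1):
    """Hypothetical differentiable simulator: a unit point mass pushed by `action`."""
    vel = vel + action * dt
    pos = pos + vel * dt
    return pos, vel


def loss(w, obs, true_pos, actions, penalty_weight=1.0):
    """Multi-step trajectory error plus a penalty for violating the assumed floor pos >= 0."""
    pos_hat, vel_hat = w * obs, 0.0  # decode the latent (here: obs itself) into a state
    pos, vel = true_pos, 0.0         # ground-truth state
    total = Dual(0.0)
    for a in actions:
        pos_hat, vel_hat = sim_step(pos_hat, vel_hat, a)
        pos, vel = sim_step(pos, vel, a)
        err = pos_hat - pos
        violation = relu(-1.0 * pos_hat)  # positive only when below the floor
        total = total + err * err + penalty_weight * violation * violation
    return total


def train(steps=200, lr=0.01, w=-0.5):
    """Gradient descent on the decoder weight; gradients flow through sim_step."""
    obs, true_pos = 2.0, 1.0  # assumed sensor model: obs = 2 * pos
    actions = [0.5, -0.2, 0.1]
    for _ in range(steps):
        out = loss(Dual(w, 1.0), obs, true_pos, actions)  # seed d(w)/d(w) = 1
        w -= lr * out.grad
    return w, loss(Dual(w), obs, true_pos, actions).val


if __name__ == "__main__":
    w, final_loss = train()
    print(w, final_loss)
```

In this toy setup the decoder starts at a weight that places the decoded state below the floor, so the constraint penalty contributes to the first few gradient steps before the trajectory loss alone drives `w` toward the value that inverts the assumed sensor scale. A real pipeline would replace the hand-written `Dual` class with an autodiff framework and the point mass with a soft-body simulator.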