Highly accurate dynamic or simulator models that closely match the real robot can facilitate model-based control (e.g., model predictive control or linear-quadratic regulators) and model-based trajectory planning (e.g., trajectory optimization), and can reduce the learning time required by reinforcement learning methods. The objective of this work is therefore to learn the residual errors between a dynamic and/or simulator model and the real robot. We do so with a neural network whose parameters are updated through an Unscented Kalman Filter (UKF) formulation. With this method, we model the residual errors using only small amounts of data, which is a necessity because we improve the simulator/dynamic model by learning directly from real-world operation. We demonstrate our method on robotic hardware (e.g., a manipulator arm and a wheeled robot) and show that the learned residual errors further close the reality gap between dynamic models, simulations, and actual hardware.
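As a rough illustration of updating network parameters through a UKF, the following minimal sketch treats the weights of a small network as the filter state with a random-walk process model, and treats the observed residual (real measurement minus nominal model prediction) as the measurement. Everything specific here is an assumption made for illustration and is not taken from the abstract: the network size and architecture (`residual_net`, `N_HIDDEN`), the scalar input and output, the noise covariances `Q` and `R`, the sigma-point parameters `alpha`, `beta`, `kappa`, and the synthetic data in the demo.

```python
import numpy as np

N_HIDDEN = 4                  # hypothetical network size, for illustration only
N_PARAMS = 3 * N_HIDDEN + 1   # W1 (H), b1 (H), W2 (H), b2 (1); scalar input/output


def residual_net(w, x):
    """Tiny MLP: scalar input x -> predicted residual, returned with shape (1,)."""
    H = N_HIDDEN
    W1, b1, W2, b2 = w[:H], w[H:2 * H], w[2 * H:3 * H], w[3 * H]
    return np.array([W2 @ np.tanh(W1 * x + b1) + b2])


def ukf_step(net, w, P, x, y, Q, R, alpha=1.0, beta=2.0, kappa=0.0):
    """One UKF update of the weights w (state) from one observed residual y.

    Assumed model: w_k = w_{k-1} + process noise (covariance Q),
                   y_k = net(w_k, x_k) + measurement noise (covariance R).
    """
    n = w.size
    lam = alpha**2 * (n + kappa) - n
    P_pred = P + Q                                # random-walk prediction step

    # Sigma points: the mean plus/minus the columns of the scaled Cholesky factor.
    L = np.linalg.cholesky((n + lam) * P_pred)
    sigma = np.vstack([w, w + L.T, w - L.T])      # shape (2n+1, n)

    # Standard scaled unscented-transform weights.
    Wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    Wc = Wm.copy()
    Wm[0] = lam / (n + lam)
    Wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)

    # Propagate every sigma point through the network.
    Y = np.array([net(s, x) for s in sigma])      # shape (2n+1, m)
    y_hat = Wm @ Y                                # predicted residual, shape (m,)
    dY, dW = Y - y_hat, sigma - w

    Pyy = dY.T @ (Wc[:, None] * dY) + R           # innovation covariance (m, m)
    Pwy = dW.T @ (Wc[:, None] * dY)               # cross covariance (n, m)
    K = np.linalg.solve(Pyy, Pwy.T).T             # Kalman gain; Pyy is symmetric

    w_new = w + K @ (np.atleast_1d(y) - y_hat)
    P_new = P_pred - K @ Pyy @ K.T
    return w_new, 0.5 * (P_new + P_new.T)         # re-symmetrize for stability


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.5, size=N_PARAMS)
    P = 0.01 * np.eye(N_PARAMS)
    Q = 1e-4 * np.eye(N_PARAMS)
    R = 1e-2 * np.eye(1)

    # Synthetic stand-in for (real measurement - nominal model prediction).
    xs = rng.uniform(-1.0, 1.0, size=200)
    rs = 0.5 * np.sin(2.0 * xs)

    def rmse(weights):
        preds = np.array([residual_net(weights, x)[0] for x in xs])
        return float(np.sqrt(np.mean((preds - rs) ** 2)))

    before = rmse(w)
    for x, r in zip(xs, rs):                      # one filter update per sample
        w, P = ukf_step(residual_net, w, P, x, r, Q, R)
    print("RMSE before:", before, "after:", rmse(w))
```

Because the filter processes one sample at a time and carries a covariance over the weights, this style of update needs no gradient computation and adapts as each new measurement arrives, which suits the small-data setting described above. The corrected prediction in such a scheme would be the nominal model output plus `residual_net(w, x)`.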