Highly accurate kinematic or simulator models that closely match the real robot facilitate model-based control (e.g., model predictive control or linear-quadratic regulators) and model-based trajectory planning (e.g., trajectory optimization), and reduce the learning time required by reinforcement learning methods. The objective of this work is therefore to learn the residual errors between a kinematic and/or simulator model and the real robot. We achieve this with auto-tuning and neural networks: the parameters of a neural network are updated by an auto-tuning method that applies equations from an Unscented Kalman Filter (UKF) formulation. With this method, we model the residual errors using only small amounts of data, a necessity because we improve the simulator/kinematic model by learning directly from hardware operation. We demonstrate our method on robotic hardware (e.g., a manipulator arm) and show that the learned residual errors further close the reality gap between kinematic models, simulations, and the real robot.
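The abstract does not give the update equations, so the following is only a minimal sketch of one common way to apply UKF measurement-update equations to neural network weights. It assumes the weights follow a random-walk model and that each hardware sample yields one scalar residual. The names `residual_net` and `ukf_weight_update`, the network size, the covariances `Q` and `R`, and the sinusoidal "true" residual are all hypothetical choices made for illustration; the data is synthetic and stands in for real robot measurements.

```python
import numpy as np

HIDDEN = 4                      # hypothetical hidden-layer width
N_PARAMS = 3 * HIDDEN + 1       # w1, b1, w2 (HIDDEN each) plus scalar b2


def residual_net(w, x):
    """Tiny 1-input, 1-output tanh network; w packs all parameters."""
    w1, b1 = w[:HIDDEN], w[HIDDEN:2 * HIDDEN]
    w2, b2 = w[2 * HIDDEN:3 * HIDDEN], w[3 * HIDDEN]
    return w2 @ np.tanh(w1 * x + b1) + b2


def ukf_weight_update(w, P, x, y, Q, R, alpha=1.0, beta=2.0, kappa=0.0):
    """One UKF step treating the weights w as the state (random-walk model)."""
    n = w.size
    lam = alpha**2 * (n + kappa) - n
    # Time update: weights are held constant, only their covariance grows.
    P = P + Q
    # Sigma points: the mean plus/minus the columns of a scaled Cholesky factor.
    L = np.linalg.cholesky((n + lam) * P)
    sigmas = np.vstack([w, w + L.T, w - L.T])          # shape (2n+1, n)
    # Mean and covariance weights of the scaled unscented transform.
    wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1.0 - alpha**2 + beta)
    # Propagate every sigma point through the network for this input.
    ys = np.array([residual_net(s, x) for s in sigmas])
    y_hat = wm @ ys
    dy = ys - y_hat
    dw = sigmas - w
    Pyy = wc @ (dy * dy) + R                           # innovation variance
    Pwy = dw.T @ (wc * dy)                             # weight/output cross-covariance
    K = Pwy / Pyy                                      # Kalman gain (scalar output)
    w_new = w + K * (y - y_hat)
    P_new = P - np.outer(K, K) * Pyy
    return w_new, 0.5 * (P_new + P_new.T)              # symmetrize for stability


rng = np.random.default_rng(0)
xs = np.linspace(-1.0, 1.0, 40)
# Synthetic stand-in for (real robot output) minus (nominal model output).
true_residual = 0.3 * np.sin(2.0 * xs)


def rmse(w):
    pred = np.array([residual_net(w, x) for x in xs])
    return float(np.sqrt(np.mean((pred - true_residual) ** 2)))


w = 0.5 * rng.standard_normal(N_PARAMS)
P = 0.01 * np.eye(N_PARAMS)
Q = 1e-5 * np.eye(N_PARAMS)
R = 1e-3

rmse_before = rmse(w)
for _ in range(30):                                    # a few passes over 40 samples
    for i in rng.permutation(xs.size):
        w, P = ukf_weight_update(w, P, xs[i], true_residual[i], Q, R)
rmse_after = rmse(w)
print("RMSE before:", rmse_before, "after:", rmse_after)
```

In this sketch the corrected model would be the nominal kinematic or simulator prediction plus `residual_net(w, x)`. Because the update needs no gradients and processes one sample at a time, it fits the small-data, learn-from-hardware setting the abstract describes, though the settings above are untuned placeholders.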