We explore the problem of estimating the mass distribution of an articulated object with an interactive robotic agent. Our method predicts the mass distribution of an object using the limited sensing and actuation capabilities of a robotic agent that interacts with the object, inspired by the role of exploratory play in human infants. We take a combined supervised- and reinforcement-learning approach to train an agent that learns to interact strategically with the object in order to estimate its mass distribution. Our method consists of two neural networks: (i) a policy network, which decides how to interact with the object, and (ii) a predictor network, which estimates the mass distribution given a history of observations and interactions. Using our method, we train a robotic arm to estimate the mass distribution of an object with moving parts (e.g., an articulated rigid-body system) by pushing it on a surface with unknown friction properties. We also demonstrate how training in simulation can be transferred to real hardware using a small amount of real-world data for fine-tuning. We use a UR10 robot to interact with 3D-printed articulated chains with varying mass distributions and show that our method significantly outperforms a baseline system that uses random pushes to interact with the object.
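The two-network decomposition described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' actual architecture: the network sizes, the plain-MLP structure, and all variable names (`OBS_DIM`, `ACT_DIM`, `N_LINKS`) are assumptions introduced only to show how a policy head and a predictor head consume the same interaction-history features.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_init(sizes):
    """Random weights for a small fully connected network (illustrative)."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    """Forward pass with tanh on hidden layers, linear output."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

# Hypothetical dimensions: a feature summary of the observation/interaction
# history, a push-action parameterization, and a chain with four links.
OBS_DIM, ACT_DIM, N_LINKS = 12, 3, 4

# (i) Policy network: maps the history summary to the next push action.
policy = mlp_init([OBS_DIM, 32, ACT_DIM])

# (ii) Predictor network: maps the same summary to per-link mass estimates.
predictor = mlp_init([OBS_DIM, 32, N_LINKS])

history_features = rng.standard_normal(OBS_DIM)
action = mlp_forward(policy, history_features)         # where/how to push
mass_estimate = mlp_forward(predictor, history_features)  # mass per link

print(action.shape, mass_estimate.shape)
```

In the paper's setting the policy head would be trained with reinforcement learning (rewarded for interactions that make the masses identifiable) while the predictor head is trained with supervised regression against ground-truth masses available in simulation.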