Robots are increasingly expected to manipulate objects in ever more unstructured environments, where object properties carry high perceptual uncertainty under any single sensory modality. This uncertainty directly impacts the success of object manipulation. In this work, we propose a reinforcement learning-based motion planning framework for object manipulation that uses both on-the-fly multisensory feedback and a learned attention-guided deep affordance model as perceptual states. The affordance model is learned from multiple sensory modalities, including vision and touch (tactile and force/torque), and is designed to predict and indicate the manipulable regions of multiple affordances (i.e., graspability and pushability) for objects with similar appearances but different intrinsic properties (e.g., mass distribution). A DQN-based deep reinforcement learning algorithm is then trained to select the optimal action for successful object manipulation. To validate the proposed framework, our method is evaluated and benchmarked on both an open dataset and a dataset we collected. The results show that the proposed method and overall framework outperform existing methods, achieving better accuracy and higher efficiency.
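To make the action-selection step concrete, the following is a minimal sketch of DQN-style epsilon-greedy action selection over a fused perceptual state, as the abstract describes. All names, dimensions, and the toy linear Q-network are illustrative assumptions, not the paper's implementation; in practice the Q-network would be a deep network consuming the multisensory features and the affordance model's output.

```python
import numpy as np

# Hypothetical discrete manipulation actions (the abstract mentions
# graspability and pushability affordances); names are assumptions.
ACTIONS = ["grasp", "push_left", "push_right"]

rng = np.random.default_rng(0)

# Toy stand-in for the Q-network: Q(s, a) = s @ W[:, a] + b[a].
# STATE_DIM stands for the fused perceptual state, e.g. vision +
# tactile + force/torque features plus affordance-map features.
STATE_DIM = 8
W = rng.normal(size=(STATE_DIM, len(ACTIONS)))
b = np.zeros(len(ACTIONS))


def q_values(state: np.ndarray) -> np.ndarray:
    """Score every discrete manipulation action for a fused state."""
    return state @ W + b


def select_action(state: np.ndarray, epsilon: float = 0.1) -> str:
    """Standard DQN epsilon-greedy policy over the Q-values."""
    if rng.random() < epsilon:  # explore with probability epsilon
        return ACTIONS[rng.integers(len(ACTIONS))]
    return ACTIONS[int(np.argmax(q_values(state)))]  # exploit


state = rng.normal(size=STATE_DIM)
print(select_action(state, epsilon=0.0))  # purely greedy choice
```

With `epsilon=0.0` the policy is purely greedy, which corresponds to deployment-time selection of the "optimal action" mentioned above; a nonzero epsilon is used during training to gather exploratory experience.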