Deep imitation learning is promising for solving dexterous manipulation tasks because it requires neither an environment model nor pre-programmed robot behavior. However, its application to dual-arm manipulation tasks remains challenging. In a dual-arm manipulation setup, the increased number of state dimensions introduced by the additional manipulator distracts the neural network and degrades its performance. We address this issue with a self-attention mechanism, which computes dependencies between the elements of a sequential input and focuses on the important ones. A Transformer, a variant of the self-attention architecture, is applied to deep imitation learning to solve dual-arm manipulation tasks in the real world. The proposed method was tested on dual-arm manipulation tasks using a real robot. The experimental results demonstrate that the Transformer-based deep imitation learning architecture can attend to the important features among the sensory inputs, thereby reducing distractions and improving manipulation performance compared with a baseline architecture without self-attention.
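To make the mechanism concrete, the following is a minimal sketch of scaled dot-product self-attention in PyTorch, the core operation the abstract describes: each element of a sequence is re-expressed as a weighted sum of all elements, with weights reflecting pairwise dependencies. The function name, the identity query/key/value projections, and the tensor shapes are illustrative assumptions, not the paper's implementation (which uses a full Transformer).

```python
import torch
import torch.nn.functional as F

def self_attention(x: torch.Tensor) -> torch.Tensor:
    """Minimal scaled dot-product self-attention over a sequence.

    x: (batch, seq_len, dim), e.g., a sequence of sensory-input tokens.
    Returns a tensor of the same shape in which each element is a
    weighted sum of all elements, weighted by pairwise dependencies.
    """
    d = x.size(-1)
    # Queries, keys, and values; identity projections for brevity
    # (a real Transformer uses learned linear projections here).
    q, k, v = x, x, x
    # Pairwise dependency scores between all sequence elements.
    scores = q @ k.transpose(-2, -1) / d**0.5  # (batch, seq, seq)
    weights = F.softmax(scores, dim=-1)        # attention weights
    return weights @ v                         # focus on important elements

# Hypothetical usage: 5 sensory-input tokens of width 16.
tokens = torch.randn(1, 5, 16)
out = self_attention(tokens)  # shape (1, 5, 16)
```

In the dual-arm setting described above, the appeal of this operation is that the attention weights can learn to down-weight the state dimensions of the irrelevant manipulator, which is the distraction-reduction effect the experiments measure.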