In the peg insertion task, a human pays attention to the seam between the peg and the hole and tries to fill it continuously using visual feedback. Imitating this behavior, we design architectures with position and orientation estimators based on the seam representation for pose alignment, which prove to generalize to unseen peg geometries. By placing the estimators in a closed control loop with reinforcement learning, we further achieve higher or comparable success rate, efficiency, and robustness compared with the baseline methods. The policy is trained entirely in simulation without any manual intervention. To achieve sim-to-real transfer, a learnable segmentation module with automatic data collection and labeling can be easily trained to decouple perception from the policy, which helps the model trained in simulation adapt quickly to the real world with negligible effort. Results are presented in simulation and on a physical robot. Code, videos, and supplemental material are available at https://github.com/xieliang555/SFN.git