Towards Hierarchical Task Decomposition using Deep Reinforcement Learning for Pick and Place Subtasks

Robotic automation of pick and place tasks has vast applications. Deep Reinforcement Learning (DRL) is one of the leading robotic automation techniques and has achieved dexterous manipulation and locomotion skills. However, a major drawback of DRL is its data-hungry training regime, which requires millions of trial-and-error attempts and is impractical on real robotic hardware. We propose a multi-subtask reinforcement learning method in which complex tasks are decomposed into low-level subtasks. These subtasks can be parametrised as expert networks and learnt via existing DRL methods. The trained subtasks can then be choreographed by a high-level synthesizer. As a test bed, we use a pick and place robotic simulator and transfer the learnt behaviour to a real robotic system. We show that our method outperforms an imitation-learning-based method and reaches a high success rate compared to an end-to-end learning approach. Furthermore, we investigate the trained subtasks to demonstrate adaptive behaviour by fine-tuning a subset of subtasks on a different task. Our approach deviates from the end-to-end learning strategy and provides an initial direction towards learning modular task representations that can generate robust behaviours.
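The decomposition described above can be illustrated with a toy sketch. It is not the authors' implementation: the 1-D track, the `State` fields, the four subtask names, and the rule-based `synthesizer` are all assumptions made for illustration, and the hand-coded functions merely stand in for expert networks that would be trained with DRL.

```python
from dataclasses import dataclass


@dataclass
class State:
    gripper: int           # gripper position on a hypothetical 1-D track
    obj: int               # object position
    target: int            # place location
    holding: bool = False  # whether the gripper currently holds the object


# Hypothetical stand-ins for trained expert networks: each maps a state to an action.
def approach(s):
    return "right" if s.gripper < s.obj else "left"


def grasp(s):
    return "close"


def transport(s):
    return "right" if s.gripper < s.target else "left"


def release(s):
    return "open"


EXPERTS = {"approach": approach, "grasp": grasp,
           "transport": transport, "release": release}


def synthesizer(s):
    """Assumed rule-based selector that picks which subtask acts in this state."""
    if not s.holding:
        return "approach" if s.gripper != s.obj else "grasp"
    return "transport" if s.gripper != s.target else "release"


def step(s, action):
    """Toy environment dynamics (an assumption, not a robotic simulator)."""
    if action == "right":
        s.gripper += 1
    elif action == "left":
        s.gripper -= 1
    elif action == "close" and s.gripper == s.obj:
        s.holding = True
    elif action == "open":
        s.holding = False
    if s.holding:
        s.obj = s.gripper  # a held object moves with the gripper
    return s


def run_episode(s, max_steps=50):
    trace = []
    for _ in range(max_steps):
        if not s.holding and s.obj == s.target:
            break  # object has been placed at the target
        name = synthesizer(s)
        trace.append(name)
        step(s, EXPERTS[name](s))
    return s, trace
```

Calling `run_episode(State(gripper=0, obj=2, target=5))` returns the final state together with the sequence of subtasks the selector invoked. Because each subtask is a separate callable, one entry in `EXPERTS` could in principle be swapped or fine-tuned without touching the others, which is the modularity the abstract argues for.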
