Enabling robots to solve multiple manipulation tasks has a wide range of industrial applications. While learning-based approaches enjoy flexibility and generalizability, scaling these approaches to solve such compositional tasks remains a challenge. In this work, we address multi-task learning through the lens of sequence-conditioning and weighted sampling. First, we propose a new benchmark suite specifically aimed at compositional tasks, MultiRavens, which allows defining custom task combinations through task modules that are inspired by industrial tasks and exemplify the difficulties faced by vision-based learning and planning methods. Second, we propose a vision-based end-to-end system architecture, Sequence-Conditioned Transporter Networks, which augments Goal-Conditioned Transporter Networks with sequence-conditioning and weighted sampling and can efficiently learn to solve multi-task long-horizon problems. Our analysis suggests that the new framework not only significantly improves pick-and-place performance on 10 novel multi-task benchmark problems, but that multi-task learning with weighted sampling can also vastly improve learning and agent performance on individual tasks.