With the rapid growth of computing power and recent advances in deep learning, we have witnessed impressive demonstrations of novel robot capabilities in research settings. Nonetheless, these learning systems exhibit brittle generalization and require excessive training data for practical tasks. To harness the capabilities of state-of-the-art robot learning models while embracing their imperfections, we present Sirius, a principled framework for humans and robots to collaborate through a division of work. In this framework, partially autonomous robots are tasked with handling a major portion of decision-making where they act reliably; meanwhile, human operators monitor the process and intervene in challenging situations. Such a human-robot team ensures safe deployment in complex tasks. Further, we introduce a new learning algorithm to improve the policy's performance on data collected from task executions. The core idea is re-weighting training samples with approximated human trust and optimizing the policies with weighted behavioral cloning. We evaluate Sirius in simulation and on real hardware, showing that Sirius consistently outperforms baselines on a collection of contact-rich manipulation tasks, achieving an 8% boost in simulation and 27% on real hardware over state-of-the-art methods, with twice-faster convergence and an 85% reduction in memory size. Videos and code are available at https://ut-austin-rpl.github.io/sirius/
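The weighted behavioral cloning idea above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the squared-error imitation loss, and the notion of per-sample trust weights are all illustrative assumptions; Sirius's actual weighting scheme is defined in the paper.

```python
import numpy as np

def weighted_bc_loss(pred_actions, demo_actions, weights):
    """Illustrative weighted behavioral cloning loss (not the paper's exact form).

    Each training sample's imitation error is scaled by a trust weight,
    e.g. upweighting samples near human interventions.
    """
    # Per-sample squared error between predicted and demonstrated actions.
    per_sample = np.mean((pred_actions - demo_actions) ** 2, axis=-1)
    # Trust-weighted average over the batch.
    return np.average(per_sample, weights=weights)

# Hypothetical usage: two samples, action dimension 3.
pred = np.zeros((2, 3))
demo = np.array([[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]])
uniform = weighted_bc_loss(pred, demo, np.array([1.0, 1.0]))   # plain BC
trusted = weighted_bc_loss(pred, demo, np.array([3.0, 1.0]))   # upweight sample 0
```

Upweighting the first (erroneous) sample raises the loss relative to uniform weighting, so gradient updates focus on the samples a human is presumed to trust more.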