We introduce a visually guided, physics-driven task-and-motion planning benchmark, which we call the ThreeDWorld Transport Challenge. In this challenge, an embodied agent equipped with two 9-DOF articulated arms is spawned at a random location in a simulated physical home environment. The agent must find a small set of objects scattered around the house, pick them up, and transport them to a desired final location. We also place containers around the house that can be used as tools to help transport objects efficiently. To complete the task, the agent must plan a sequence of actions that changes the state of a large number of objects under realistic physical constraints. We build this benchmark using the ThreeDWorld simulation: a virtual 3D environment where all objects respond to physics, and where agents can be controlled through a fully physics-driven navigation and interaction API. We evaluate several existing agents on this benchmark. Experimental results suggest that: 1) a pure RL model struggles on this challenge; 2) hierarchical planning-based agents can transport some objects but are still far from solving the task. We anticipate that this benchmark will empower researchers to develop more intelligent physics-driven robots for the physical world.