Deploying robots in real-world domains, such as households and flexible manufacturing lines, requires the robots to be taskable on demand. Linear temporal logic (LTL) is a widely used specification language with a compositional grammar that naturally induces commonalities across tasks. However, most prior research on reinforcement learning with LTL specifications treats every new formula independently. We propose LTL-Transfer, a novel algorithm that enables subpolicy reuse across tasks by segmenting policies for training tasks into portable, transition-centric skills capable of satisfying a wide array of unseen LTL specifications while respecting safety-critical constraints. Our experiments in a Minecraft-inspired domain show that LTL-Transfer can satisfy over 90% of 500 unseen tasks after training on only 50 task specifications, without ever violating a safety constraint. We also deployed LTL-Transfer on a quadruped mobile manipulator in a household environment to show its ability to transfer to many fetch-and-delivery tasks in a zero-shot fashion.