Decision-making is challenging in robotics environments with continuous object-centric states, continuous actions, long horizons, and sparse feedback. Hierarchical approaches, such as task and motion planning (TAMP), address these challenges by decomposing decision-making into two or more levels of abstraction. In a setting where demonstrations and symbolic predicates are given, prior work has shown how to learn symbolic operators and neural samplers for TAMP with manually designed parameterized policies. Our main contribution is a method for learning parameterized policies in combination with operators and samplers. These components are packaged into modular neuro-symbolic skills and sequenced with search-then-sample TAMP to solve new tasks. In experiments in four robotics domains, we show that our approach, bilevel planning with neuro-symbolic skills, can solve a wide range of tasks with varying initial states, goals, and objects, outperforming six baselines and ablations. Video: https://youtu.be/PbFZP8rPuGg Code: https://tinyurl.com/skill-learning
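To make the search-then-sample idea concrete, the following is a minimal sketch on a toy one-dimensional pick-and-place problem. It is not the authors' implementation: the `Skill` class, the `abstract` predicate function, the `Pick` and `Place` skills, and all numeric thresholds are hypothetical, the components are hand-written rather than learned, the policy is collapsed into a direct state transition, and refinement resamples each step independently instead of backtracking.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional, Tuple

State = Tuple[float, bool]  # toy state: (block position, is the block held?)
Atoms = FrozenSet[str]


def abstract(state: State) -> Atoms:
    """Hypothetical predicates: map a continuous state to symbolic atoms."""
    pos, holding = state
    atoms = {"Holding"} if holding else {"HandEmpty"}
    if not holding and 0.8 <= pos <= 1.0:
        atoms.add("InGoalRegion")
    return frozenset(atoms)


@dataclass(frozen=True)
class Skill:
    """One skill = symbolic operator + sampler + parameterized policy."""
    name: str
    preconditions: Atoms
    add_effects: Atoms
    delete_effects: Atoms
    sampler: Callable[[State, random.Random], float]  # proposes a continuous parameter
    policy: Callable[[State, float], State]  # simplification: returns the next state


SKILLS = [
    Skill("Pick", frozenset({"HandEmpty"}), frozenset({"Holding"}),
          frozenset({"HandEmpty", "InGoalRegion"}),
          sampler=lambda s, rng: rng.uniform(-0.2, 0.2),  # grasp offset
          policy=lambda s, p: (s[0], True) if abs(p) < 0.1 else s),
    Skill("Place", frozenset({"Holding"}), frozenset({"HandEmpty", "InGoalRegion"}),
          frozenset({"Holding"}),
          sampler=lambda s, rng: rng.uniform(0.0, 1.0),  # target position
          policy=lambda s, p: (p, False)),
]


def search_skeleton(init: Atoms, goal: Atoms, skills: List[Skill]) -> Optional[List[Skill]]:
    """Search level: breadth-first search over symbolic operators only."""
    queue = deque([(init, [])])
    visited = {init}
    while queue:
        atoms, plan = queue.popleft()
        if goal <= atoms:
            return plan
        for skill in skills:
            if skill.preconditions <= atoms:
                nxt = (atoms - skill.delete_effects) | skill.add_effects
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append((nxt, plan + [skill]))
    return None


def refine(state: State, skeleton: List[Skill], rng: random.Random,
           max_tries: int = 200) -> Optional[List[State]]:
    """Sample level: draw parameters until each step reaches its expected atoms."""
    trajectory = [state]
    for skill in skeleton:
        expected = (abstract(state) - skill.delete_effects) | skill.add_effects
        for _ in range(max_tries):
            nxt = skill.policy(state, skill.sampler(state, rng))
            if abstract(nxt) == expected:
                state = nxt
                trajectory.append(state)
                break
        else:
            return None  # no sampled parameter achieved the operator's effects
    return trajectory


def plan(init_state: State, goal: Atoms, skills: List[Skill], seed: int = 0):
    """Search for a symbolic skeleton, then refine it with sampled parameters."""
    skeleton = search_skeleton(abstract(init_state), goal, skills)
    if skeleton is None:
        return None
    return refine(init_state, skeleton, random.Random(seed))
```

Calling `plan((0.2, False), frozenset({"InGoalRegion"}), SKILLS)` first finds the skeleton Pick, Place at the symbolic level and then samples a grasp offset and a placement position that make the abstract effects hold in the continuous state. A fuller bilevel planner would backtrack to earlier steps or to alternative skeletons when refinement fails.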