We present hierarchical policy blending as optimal transport (HiPBOT). This hierarchical framework adapts the weights of low-level reactive expert policies by adding a look-ahead planning layer on the parameter space of a product of expert policies and agents. The high-level planner realizes policy blending via unbalanced optimal transport: it consolidates the scaling of the underlying Riemannian motion policies, effectively adjusting their Riemannian matrix, and decides on the priorities between experts and agents, guaranteeing safety and task success. Experimental results in application scenarios ranging from low-dimensional navigation to high-dimensional whole-body control showcase the efficacy and efficiency of HiPBOT, which outperforms state-of-the-art baselines that either perform probabilistic inference or define a tree structure of experts, paving the way for new applications of optimal transport to robot control. More material at https://sites.google.com/view/hipobot
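As a rough illustration of what adapting the weights of reactive expert policies can look like, the sketch below blends two 2D Riemannian motion policies with a metric-weighted average, a = (Σ w_i M_i)^{-1} Σ w_i M_i a_i. It rests on several assumptions: the weights are hard-coded, whereas in HiPBOT they would come from the unbalanced optimal transport planner, which is not implemented here. The function name `blend_policies`, the expert accelerations, and the metrics are hypothetical values chosen for the example. Applying each weight as a scale on the expert's metric is one reading of "adjusting their Riemannian matrix", not a confirmed detail of the method.

```python
def blend_policies(accels, metrics, weights):
    """Metric-weighted blend of 2D expert accelerations.

    accels  : list of [ax, ay] desired accelerations, one per expert
    metrics : list of 2x2 positive semi-definite matrices, one per expert
    weights : list of non-negative scalars (assumed to be supplied by a
              high-level planner; hard-coded in this sketch)
    """
    # Accumulate A = sum_i w_i M_i and b = sum_i w_i M_i a_i.
    A = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for a, M, w in zip(accels, metrics, weights):
        for r in range(2):
            for c in range(2):
                A[r][c] += w * M[r][c]
                b[r] += w * M[r][c] * a[c]

    # Solve the 2x2 system A x = b in closed form.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if abs(det) < 1e-12:
        raise ValueError("blended metric is singular; no unique acceleration")
    x = (A[1][1] * b[0] - A[0][1] * b[1]) / det
    y = (A[0][0] * b[1] - A[1][0] * b[0]) / det
    return [x, y]


# Hypothetical experts: one pulls toward a goal along x, one pushes
# away from an obstacle along y. Both use the identity metric.
identity = [[1.0, 0.0], [0.0, 1.0]]
goal_accel = [1.0, 0.0]
avoid_accel = [0.0, 1.0]

blended = blend_policies(
    [goal_accel, avoid_accel],
    [identity, identity],
    [0.75, 0.25],  # placeholder weights, not planner output
)
print(blended)
```

With identity metrics the blend reduces to a plain weighted average of the expert accelerations. Non-identity metrics let an expert dominate along the directions it considers important, which is why rescaling the metrics changes the priorities between experts.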