Differentiable planning promises end-to-end differentiability and adaptivity. However, one issue prevents existing differentiable planners from scaling to larger problems: they must differentiate through the forward iteration layers to compute gradients. This couples forward computation with backpropagation and forces a trade-off between forward planner performance and the computational cost of the backward pass. To alleviate this issue, we propose to differentiate through the Bellman fixed-point equation, which decouples the forward and backward passes for the Value Iteration Network (VIN) and its variants. This yields a backward cost that is constant in the planning horizon, allows a flexible forward budget, and helps scale up to large tasks. We study the convergence stability, scalability, and efficiency of the proposed implicit version of VIN and its variants, and demonstrate their advantages on a range of planning tasks: 2D navigation, visual navigation, and 2-DOF manipulation in configuration space and workspace.
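To make the idea of differentiating through the Bellman fixed point concrete, the following is a minimal sketch and not the authors' implementation. It assumes a small tabular MDP instead of a convolutional VIN, and it swaps the hard max for a soft (log-sum-exp) backup so that the operator is smooth. All names (`soft_bellman`, `solve_forward`, `implicit_grad`) and the example numbers are hypothetical. At a fixed point v* = f(v*, r), the implicit function theorem gives dL/dr = (I - J)^{-T} dL/dv with J = df/dv evaluated at v*, so the backward pass needs only v* and not the forward iterates.

```python
import math

def soft_bellman(v, r, P, gamma):
    """One soft Bellman backup on a tabular MDP (assumed setting).

    P[a][s][t] is the transition probability s -> t under action a.
    Returns the backed-up values and the softmax action weights pi[s][a].
    """
    n_actions, n_states = len(P), len(r)
    v_new, pi = [], []
    for s in range(n_states):
        q = [r[s] + gamma * sum(P[a][s][t] * v[t] for t in range(n_states))
             for a in range(n_actions)]
        m = max(q)  # subtract the max for a numerically stable log-sum-exp
        z = [math.exp(x - m) for x in q]
        total = sum(z)
        v_new.append(m + math.log(total))
        pi.append([x / total for x in z])
    return v_new, pi

def solve_forward(r, P, gamma, tol=1e-12, max_iter=10_000):
    """Forward pass: iterate the backup until the values stop changing."""
    v = [0.0] * len(r)
    for _ in range(max_iter):
        v_new, _ = soft_bellman(v, r, P, gamma)
        if max(abs(a - b) for a, b in zip(v_new, v)) < tol:
            return v_new
        v = v_new
    return v

def implicit_grad(v_star, r, P, gamma, dL_dv, tol=1e-12, max_iter=10_000):
    """Backward pass: dL/dr computed from v_star alone.

    For this operator df/dr is the identity and
    J[s][t] = df_s/dv_t = gamma * sum_a pi[s][a] * P[a][s][t].
    We solve g = dL_dv + J^T g by iteration, which converges for gamma < 1.
    """
    n = len(r)
    _, pi = soft_bellman(v_star, r, P, gamma)
    J = [[gamma * sum(pi[s][a] * P[a][s][t] for a in range(len(P)))
          for t in range(n)] for s in range(n)]
    g = list(dL_dv)
    for _ in range(max_iter):
        g_new = [dL_dv[t] + sum(J[s][t] * g[s] for s in range(n))
                 for t in range(n)]
        if max(abs(a - b) for a, b in zip(g_new, g)) < tol:
            return g_new
        g = g_new
    return g

# Hypothetical 3-state, 2-action example.
P = [
    [[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]],
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.3, 0.0, 0.7]],
]
r = [0.0, -1.0, 1.0]
w = [1.0, 0.5, -2.0]   # loss L = sum_s w[s] * v_star[s], so dL/dv = w
gamma = 0.9

v_star = solve_forward(r, P, gamma)
grad = implicit_grad(v_star, r, P, gamma, w)
```

In this sketch the forward solver can run for as many or as few iterations as the budget allows, and no intermediate iterates are stored for the backward pass. The backward solve is itself iterative here; a direct linear solve would work equally well for small state spaces.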