Differentiable planning promises end-to-end differentiability and adaptivity. However, one issue prevents it from scaling to larger problems: existing planners must differentiate through the forward iteration layers to compute gradients, which couples forward computation with backpropagation and forces a trade-off between forward planner performance and the computational cost of the backward pass. To alleviate this issue, we propose differentiating through the Bellman fixed-point equation, which decouples the forward and backward passes for the Value Iteration Network (VIN) and its variants. This yields a backward cost that is constant in the planning horizon, allows a flexible forward budget, and helps scale to large tasks. We study the convergence stability, scalability, and efficiency of the proposed implicit versions of VIN and its variants and demonstrate their superiority on a range of planning tasks: 2D navigation, visual navigation, and 2-DOF manipulation in configuration space and workspace.
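To make the idea of differentiating through the Bellman fixed point concrete, the following is a minimal sketch on a toy problem, not the authors' implementation. It assumes a tabular, deterministic MDP with a hard max, whereas VIN uses learned convolutional operators. The names `bellman`, `solve_forward`, `implicit_grad`, `reward`, and `nxt`, and the value of `GAMMA`, are all hypothetical. The forward pass iterates the backup until it converges to v*. The backward pass never revisits those iterations: it applies the implicit function theorem at v* and solves the adjoint system w = g + J^T w, where J is the Jacobian of the backup with respect to v.

```python
GAMMA = 0.9  # discount factor (assumed value for this toy example)


def bellman(v, reward, nxt):
    """One Bellman backup: v'[s] = max_a reward[s][a] + GAMMA * v[nxt[s][a]]."""
    return [
        max(reward[s][a] + GAMMA * v[nxt[s][a]] for a in range(len(reward[s])))
        for s in range(len(v))
    ]


def solve_forward(reward, nxt, tol=1e-12, max_iter=10_000):
    """Forward pass: iterate the backup until v stops changing (v* = bellman(v*))."""
    v = [0.0] * len(reward)
    for _ in range(max_iter):
        v_new = bellman(v, reward, nxt)
        if max(abs(a - b) for a, b in zip(v_new, v)) < tol:
            return v_new
        v = v_new
    return v


def implicit_grad(v_star, reward, nxt, grad_v, tol=1e-12, max_iter=10_000):
    """Backward pass: gradient of a loss w.r.t. reward, using only v*.

    grad_v is dLoss/dv evaluated at v*. No forward iterates are stored or replayed.
    """
    n = len(v_star)
    # Greedy action at the fixed point; the max is locally linear around it.
    pi = [
        max(range(len(reward[s])),
            key=lambda a, s=s: reward[s][a] + GAMMA * v_star[nxt[s][a]])
        for s in range(n)
    ]
    # Solve w = grad_v + J^T w, where J[s][s'] = GAMMA if s' == nxt[s][pi[s]] else 0.
    w = list(grad_v)
    for _ in range(max_iter):
        w_new = list(grad_v)
        for s in range(n):
            w_new[nxt[s][pi[s]]] += GAMMA * w[s]
        converged = max(abs(a - b) for a, b in zip(w_new, w)) < tol
        w = w_new
        if converged:
            break
    # d(backup)[s] / d reward[s][a] is 1 for the greedy action and 0 otherwise.
    return [[w[s] if a == pi[s] else 0.0 for a in range(len(reward[s]))]
            for s in range(n)]


# Toy 3-state chain: action 0 stays in place, action 1 moves right (state 2 absorbs).
reward = [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0]]
nxt = [[0, 1], [1, 2], [2, 2]]

v_star = solve_forward(reward, nxt)
# Loss = v*[0], so dLoss/dv is the first unit vector.
grad_r = implicit_grad(v_star, reward, nxt, grad_v=[1.0, 0.0, 0.0])
```

In this sketch, the cost of `implicit_grad` depends only on how accurately the adjoint system is solved, not on how many iterations `solve_forward` ran, which is the decoupling the abstract describes. The forward solver could be swapped for any other method that reaches the same fixed point without changing the backward pass.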