Gradient-Based Mixed Planning with Discrete and Continuous Actions

Planning problems that combine discrete logical relations with continuous numeric changes are challenging to solve in real-world dynamic environments. Existing numeric planning systems often discretize numeric variables or impose convex quadratic constraints on them, which hurts performance. In this paper, we propose a novel algorithmic framework, based on gradient descent, for numeric planning problems that mix discrete and continuous actions. We cast numeric planning with discrete and continuous actions as an optimization problem by integrating a heuristic function based on discrete effects. Specifically, we propose a gradient-based framework that simultaneously optimizes the continuous parameters and the actions of candidate plans. The framework is combined with a heuristic module that uses relaxation to estimate the best candidate plan for transitioning from the initial state to the goal. We repeatedly update the numeric parameters and recompute the candidate plan until the process converges to a valid plan for the planning problem. Our empirical study shows that the framework is both effective and efficient, especially on non-convex planning problems.
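To make the idea of tuning continuous action parameters by gradient descent concrete, here is a minimal toy sketch. It is not the paper's algorithm: it assumes a fixed candidate plan of three hypothetical `move(dx)` actions on a one-dimensional numeric state, and it omits the heuristic module that selects and revises the discrete plan. All function names, the goal value, and the learning rate are illustrative assumptions.

```python
from typing import Callable, List


def rollout(start: float, params: List[float]) -> float:
    """Apply a fixed plan of hypothetical move(dx) actions to a 1-D state."""
    x = start
    for dx in params:
        x += dx
    return x


def plan_loss(params: List[float], start: float = 0.0, goal: float = 5.0) -> float:
    """Squared distance between the plan's final state and the goal (assumed loss)."""
    return (rollout(start, params) - goal) ** 2


def numeric_grad(f: Callable[[List[float]], float],
                 params: List[float], eps: float = 1e-6) -> List[float]:
    """Central-difference estimate of the gradient of f at params."""
    grad = []
    for i in range(len(params)):
        up = params[:i] + [params[i] + eps] + params[i + 1:]
        down = params[:i] + [params[i] - eps] + params[i + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def optimize(params: List[float], f: Callable[[List[float]], float],
             lr: float = 0.1, steps: int = 200, tol: float = 1e-10) -> List[float]:
    """Plain gradient descent on the continuous parameters of one candidate plan."""
    params = list(params)
    for _ in range(steps):
        if f(params) < tol:  # the plan reaches the goal closely enough
            break
        g = numeric_grad(f, params)
        params = [p - lr * gi for p, gi in zip(params, g)]
    return params


# Start from a plan whose actions do nothing and tune the step sizes.
tuned = optimize([0.0, 0.0, 0.0], plan_loss)
final_state = rollout(0.0, tuned)
```

In this toy the loss is convex, so descent converges from any starting point. The abstract's emphasis is on non-convex problems, where a loop like this would only be one component alongside the heuristic plan-selection step.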

