Several goal-oriented problems in the real-world can be naturally expressed as Stochastic Shortest Path Problems (SSPs). However, a key difficulty for computing solutions for problems in the SSP framework is that the computational requirements often make finding solutions to even moderately sized problems intractable. Solutions to many of such problems can often be expressed as generalized policies that are quite easy to compute from small examples and are readily applicable to problems with a larger number of objects and/or different object names. In this paper, we provide a propose an approach that uses canonical abstractions to compute such generalized policies and represent them as AND-OR graphs that translate to simple non-deterministic, memoryless controllers. Such policy structures naturally lend themselves to a hierarchical approach for solving problems and we show that our approach can be embedded in any SSP solver to compute hierarchically optimal policies. We conducted an empirical evaluation on some well-known planning benchmarks and difficult robotics domains and show that our approach is promising, often computing optimal policies significantly faster than state-of-art SSP solvers.
翻译:现实世界中一些面向目标的问题可以自然地以 " 最短路径问题 " (SSP)来表达。然而,计算SSP框架中问题解决方案的一个关键困难是,计算要求往往导致找到解决即使是中等规模的棘手问题的办法。许多这类问题的解决方案往往可以表现为普遍政策,从小例子中很容易地计算出来,并且很容易适用于涉及更多物体和(或)不同物体名称的问题。在本文中,我们提出了一个方法,用粗略的抽象概念来计算这类普遍政策,并把它们作为可转化为简单的非决定性的、没有记忆的控制器的和/或图表来代表。这种政策结构自然有利于采取分级方法解决问题,我们表明我们的方法可以植根于任何SSP解决方案的解决方案中,以便按等级来计算最佳的政策。我们对某些众所周知的规划基准和困难的机器人领域进行了经验评估,并表明我们的方法很有希望,而且往往比SSP解决方案的州级解决器更快速地计算最佳政策。