We study planning problems faced by robots operating in uncertain environments, with incomplete knowledge of state and with actions that are noisy and/or imprecise. This paper identifies a new problem sub-class that models settings in which state information is revealed only intermittently, through some exogenous process that provides it periodically. Several practical domains fit this model, including the specific scenario that motivates our research: autonomous navigation of a planetary exploration rover augmented by remote imaging. With an eye to efficient specialized solution methods, we examine the structure of instances of this sub-class. They lead to Markov Decision Processes with exponentially large action spaces. Because those actions comprise sequences of more atomic elements, however, one may establish performance bounds by comparing policies under different information assumptions, which gives a systematic way to construct such bounds. The bounds are useful because, in conjunction with the insights they confer, they can be employed in bounding-based methods to obtain high-quality solutions efficiently; the empirical results we present demonstrate their effectiveness for the considered problems. Time, and more specifically the time until information is revealed, plays a distinctive role in these problems, and we uncover and discuss several interesting subtleties in this regard.