Representing and reasoning about uncertainty is crucial for autonomous agents acting in partially observable environments with noisy sensors. Partially observable Markov decision processes (POMDPs) serve as a general framework for representing problems in which uncertainty is an important factor. Online sample-based POMDP methods have emerged as efficient approaches to solving large POMDPs and have been shown to extend to continuous domains. However, these solutions struggle to find long-horizon plans in problems with significant uncertainty. Exploration heuristics can help guide planning, but many real-world settings contain significant task-irrelevant uncertainty that might distract from the task objective. In this paper, we propose STRUG, an online POMDP solver capable of handling domains that require long-horizon planning with significant task-relevant and task-irrelevant uncertainty. We demonstrate our solution on several temporally extended versions of toy POMDP problems, as well as on robotic manipulation of articulated objects using a neural perception frontend to construct a distribution of possible models. Our results show that STRUG outperforms current sample-based online POMDP solvers on several tasks.