Reasoning about uncertainty is vital in many real-life autonomous systems. However, current state-of-the-art planning algorithms either cannot reason about uncertainty explicitly or do so only at a high computational cost. Here, we focus on making informed decisions efficiently, using reward functions that explicitly deal with uncertainty. We formulate an approximation, namely an abstract observation model, that uses an aggregation scheme to alleviate computational costs. We derive bounds on the expected information-theoretic reward function and, as a consequence, on the value function. We then propose a method to refine the aggregation so that it achieves identical action selection in a fraction of the computational time.
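As a rough illustration of why aggregating observations can bound an information-theoretic reward, the following minimal sketch compares the expected posterior entropy under a full discrete observation model with that under an aggregated one. It is not the paper's formulation: the three-state prior, the four-observation likelihood table, the clustering, and all function names are hypothetical, and the only property relied on is the general fact that coarsening observations cannot reduce expected posterior entropy.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def expected_posterior_entropy(prior, likelihood):
    """E_o[H(X | o)] for prior[x] and likelihood[x][o] = P(o | x)."""
    n_obs = len(likelihood[0])
    total = 0.0
    for o in range(n_obs):
        joint = [prior[x] * likelihood[x][o] for x in range(len(prior))]
        p_o = sum(joint)
        if p_o > 0.0:
            # Weight the entropy of the posterior P(x | o) by P(o).
            total += p_o * entropy([j / p_o for j in joint])
    return total


def aggregate(likelihood, clusters):
    """Merge observations: P(cluster | x) is the sum of P(o | x) over its members."""
    return [[sum(row[o] for o in c) for c in clusters] for row in likelihood]


# Hypothetical toy problem (assumption, not taken from the paper).
prior = [0.5, 0.3, 0.2]
likelihood = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.1, 0.1, 0.2, 0.6],
]
clusters = [[0, 1], [2, 3]]  # four observations aggregated into two

h_prior = entropy(prior)
h_full = expected_posterior_entropy(prior, likelihood)
h_abstract = expected_posterior_entropy(prior, aggregate(likelihood, clusters))

print(f"full: {h_full:.4f}  aggregated: {h_abstract:.4f}  prior: {h_prior:.4f}")
```

In this toy setting the aggregated model needs only one posterior update per cluster rather than one per observation, and its expected entropy lies between the full-model value and the prior entropy. Splitting a cluster into smaller ones moves the aggregated value toward the full-model value, which is the intuition behind refining the aggregation.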