In this paper we focus on the problem of learning an optimal policy online for Active Visual Search (AVS) of objects in unknown indoor environments. We propose POMP++, a planning strategy that introduces a novel formulation on top of the classic Partially Observable Monte Carlo Planning (POMCP) framework to allow training-free online policy learning in unknown environments. We present a new belief reinvigoration strategy that allows POMCP to be used with a dynamically growing state space, which addresses the online generation of the floor map. We evaluate our method on two public benchmark datasets: AVD, which is acquired by real robotic platforms, and Habitat ObjectNav, which is rendered from real 3D scene scans. Our method achieves the best success rate, with an improvement of >10% over state-of-the-art methods.