The goal of object navigation is to reach a target object in an unseen environment using visual information. Previous works usually train a deep model that lets an agent predict actions in real time. However, when the target object is not in the agent's egocentric view in an unseen environment, the agent may be unable to make good decisions for lack of guidance. In this paper, we propose a hierarchical object-to-zone (HOZ) graph that guides the agent in a coarse-to-fine manner, together with an online-learning mechanism that updates the HOZ graph according to real-time observations in new environments. The HOZ graph is composed of scene nodes, zone nodes and object nodes. Given the pre-learned HOZ graph, the real-time observation and the target goal, the agent continually plans an optimal path from zone to zone. The next zone on the estimated path is treated as a sub-goal, which is fed into the deep reinforcement learning model for action prediction. We evaluate our method on the AI2-Thor simulator. In addition to the widely used metrics success rate (SR) and success weighted by path length (SPL), we propose a new evaluation metric, SAE, which focuses on the effective action rate. Experimental results demonstrate the effectiveness and efficiency of the proposed method.
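The zone-to-zone planning step can be illustrated with a minimal sketch. It is not the authors' implementation: the zone names, edge costs, object-in-zone probabilities, and the helpers `pick_target_zone`, `plan_zone_path` and `next_subgoal` are all hypothetical, and Dijkstra's algorithm stands in for whatever planner the method actually uses. The sketch only shows how a sub-goal zone could be read off a planned path over zone nodes.

```python
import heapq

# Hypothetical zone graph: zone -> {neighbouring zone: traversal cost}.
ZONE_EDGES = {
    "sofa": {"table": 1.0, "sink": 4.0},
    "table": {"sofa": 1.0, "sink": 1.5},
    "sink": {"table": 1.5, "sofa": 4.0, "fridge": 1.0},
    "fridge": {"sink": 1.0},
}

# Hypothetical object-node statistics: zone -> {object: probability of presence}.
OBJECT_IN_ZONE = {
    "sofa": {"Apple": 0.05},
    "table": {"Apple": 0.20},
    "sink": {"Apple": 0.10},
    "fridge": {"Apple": 0.70},
}


def pick_target_zone(target):
    """Return the zone most likely to contain the target object."""
    return max(OBJECT_IN_ZONE, key=lambda z: OBJECT_IN_ZONE[z].get(target, 0.0))


def plan_zone_path(edges, start, goal):
    """Dijkstra's shortest path over zone nodes; returns [] if goal is unreachable."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, zone = heapq.heappop(heap)
        if zone in visited:
            continue
        visited.add(zone)
        if zone == goal:
            break
        for nxt, cost in edges.get(zone, {}).items():
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = zone
                heapq.heappush(heap, (nd, nxt))
    if goal not in dist:
        return []
    # Walk the predecessor links back from the goal to the start.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path


def next_subgoal(path):
    """The next zone on the path is the sub-goal; at the goal zone, stay there."""
    if not path:
        return None
    if len(path) > 1:
        return path[1]
    return path[0]


target_zone = pick_target_zone("Apple")
path = plan_zone_path(ZONE_EDGES, "sofa", target_zone)
subgoal = next_subgoal(path)
```

In this toy graph the agent starts in the `sofa` zone and looks for an `Apple`. Replanning after every observation, as the abstract describes, would amount to calling `plan_zone_path` again from the agent's current zone.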