Object Goal Navigation requires a robot to find and navigate to an instance of a target object class in a previously unseen environment. Our framework incrementally builds a semantic map of the environment over time and repeatedly selects a long-term goal ('where to go') based on that map to locate an instance of the target object. Long-term goal selection is formulated as a vision-based deep reinforcement learning problem. Specifically, an Encoder Network is trained to extract high-level features from the semantic map and select a long-term goal. In addition, we incorporate data augmentation and Q-function regularization to make long-term goal selection more effective. We report experimental results on the photo-realistic Gibson benchmark dataset in the AI Habitat 3D simulation environment, demonstrating substantial improvement on standard metrics over a state-of-the-art data-driven baseline.
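To make the pipeline concrete, the sketch below illustrates (in PyTorch) how a CNN encoder over a multi-channel semantic map, a Q-function over candidate long-term goals, and random-shift data augmentation with Q-value averaging could fit together. This is a minimal illustration under assumed settings, not the authors' released implementation; the channel count, map resolution, network sizes, and the DrQ-style random-shift augmentation are all assumptions introduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_SEM_CHANNELS = 16   # assumed: obstacle/explored layers plus object-category layers
MAP_SIZE = 240          # assumed semantic-map resolution (cells per side)

class Encoder(nn.Module):
    """CNN encoder that maps a multi-channel semantic map to a feature vector."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(NUM_SEM_CHANNELS, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )

    def forward(self, sem_map):
        return self.net(sem_map)

class Critic(nn.Module):
    """Q(s, a): scores a candidate long-term goal a = (x, y) given the encoded map."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.q = nn.Sequential(nn.Linear(feat_dim + 2, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, feat, goal_xy):
        return self.q(torch.cat([feat, goal_xy], dim=-1))

def random_shift(sem_map, pad=4):
    """Random-shift augmentation: replicate-pad the map, then crop at a random offset."""
    padded = F.pad(sem_map, (pad,) * 4, mode="replicate")
    h = torch.randint(0, 2 * pad + 1, (1,)).item()
    w = torch.randint(0, 2 * pad + 1, (1,)).item()
    return padded[..., h:h + MAP_SIZE, w:w + MAP_SIZE]

def regularized_q(encoder, critic, sem_map, goal_xy, k=2):
    """Q-function regularization: average the Q estimate over k augmented views of the map."""
    return torch.stack(
        [critic(encoder(random_shift(sem_map)), goal_xy) for _ in range(k)]
    ).mean(dim=0)

if __name__ == "__main__":
    enc, critic = Encoder(), Critic()
    sem_map = torch.rand(1, NUM_SEM_CHANNELS, MAP_SIZE, MAP_SIZE)
    goal = torch.rand(1, 2)  # candidate long-term goal, normalized map coordinates in [0, 1]^2
    print("regularized Q value:", regularized_q(enc, critic, sem_map, goal).item())
```

In this sketch the long-term goal is scored rather than directly regressed; at decision time one would pick the candidate goal with the highest regularized Q value and hand it to a local planner. How goals are parameterized and how the critic is trained are design choices that the abstract leaves to the body of the paper.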