How do humans navigate to target objects in novel scenes? Do we use the semantic/functional priors we have built over years to search and navigate efficiently? For example, to find mugs, we look in cabinets near the coffee machine, and for fruits we try the fridge. In this work, we focus on incorporating such semantic priors into the task of semantic navigation. We propose using Graph Convolutional Networks (GCNs) to incorporate the prior knowledge into a deep reinforcement learning framework. The agent uses features from the knowledge graph to predict actions. For evaluation, we use the AI2-THOR framework. Our experiments show that semantic knowledge significantly improves performance. More importantly, we show improved generalization to unseen scenes and/or objects. The supplementary video can be accessed at the following link: https://youtu.be/otKjuO805dE .
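To illustrate the core building block, the following is a minimal sketch of a single graph-convolutional layer in the standard Kipf–Welling form, not the paper's exact architecture: the adjacency matrix `A`, node feature matrix `H`, and weight matrix `W` are assumed placeholders for the knowledge graph and learned parameters.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: ReLU(D^{-1/2} (A + I) D^{-1/2} H W).

    A: (n, n) adjacency matrix of the knowledge graph (assumed binary).
    H: (n, d_in) node features.
    W: (d_in, d_out) learned weights.
    """
    A_hat = A + np.eye(A.shape[0])                      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))       # degree^{-1/2}
    A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)              # ReLU activation

# Toy usage: 2 nodes connected by one edge, identity features/weights.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
out = gcn_layer(A, np.eye(2), np.eye(2))
```

In a navigation agent along these lines, the output node features would be aggregated and concatenated with visual features before the policy head predicts the next action.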