Incorporating domain-specific priors in search and navigation tasks has shown promising results in improving generalization and sample complexity over end-to-end trained policies. In this work, we study how object embeddings that capture spatial semantic priors can guide search and navigation tasks in a structured environment. Humans can search for an object such as a book or a plate in an unseen house based on the spatial semantics of the larger objects they detect. For example, a book is likely to be on a bookshelf or a table, whereas a plate is likely to be in a cupboard or dishwasher. We propose a method to incorporate such spatial semantic awareness in robots by leveraging pre-trained language models and multi-relational knowledge bases as sources of object embeddings. We demonstrate the use of these object embeddings to search for a query object in an unseen indoor environment, and we measure their performance in an indoor simulator (AI2Thor). We further evaluate different pre-trained embeddings on Success Rate (SR) and Success weighted by Path Length (SPL).
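The abstract names SR and SPL without defining them. As a minimal sketch, assuming the commonly used definition of SPL (the mean over episodes of the success indicator times the ratio of shortest-path length to the larger of the agent's path length and the shortest-path length), the two metrics could be computed as follows. The function names and argument layout are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the query object was found."""
    if not successes:
        raise ValueError("at least one episode is required")
    return sum(1 for s in successes if s) / len(successes)


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length, under the assumed standard definition.

    shortest_lengths[i] is the shortest-path length from start to goal in
    episode i; path_lengths[i] is the length of the path the agent took.
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if not (len(shortest_lengths) == n and len(path_lengths) == n):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for ok, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        if not ok:
            continue  # failed episodes contribute zero
        denom = max(taken, shortest)
        # Guard against a degenerate episode where the agent starts at the goal.
        total += shortest / denom if denom > 0 else 1.0
    return total / n


if __name__ == "__main__":
    # Hypothetical episodes, for illustration only (not results from the paper).
    ok = [True, False, True]
    shortest = [2.0, 3.0, 4.0]
    taken = [4.0, 5.0, 4.0]
    print(success_rate(ok), spl(ok, shortest, taken))
```

Under this definition SPL can never exceed SR, since each successful episode contributes at most 1; an agent that succeeds only by taking long detours scores well on SR but poorly on SPL.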