In human-robot collaborative tasks where a robot helps its partner by finding described objects, the depth dimension plays a critical role in successful task completion. Existing studies have mostly focused on comprehending object descriptions using RGB images. However, 3-dimensional space perception that includes depth information is fundamental in real-world environments. In this work, we propose a method that identifies the described objects by taking depth data into account. Using depth features significantly improves performance both in scenes where depth data is critical for disambiguating the objects and across our whole evaluation dataset, which contains objects that can be specified with and without the depth dimension.