Out of the Box: Embodied Navigation in the Real World

The research field of Embodied AI has witnessed substantial progress in visual navigation and exploration thanks to powerful simulation platforms and the availability of 3D data of photorealistic indoor environments. These two factors have opened the door to a new generation of intelligent agents capable of achieving nearly perfect PointGoal Navigation. However, such architectures are commonly trained with millions, if not billions, of frames and tested in simulation. Alongside great enthusiasm, these results raise a question: how many researchers will effectively benefit from these advances? In this work, we detail how to transfer the knowledge acquired in simulation to the real world. To that end, we describe the architectural discrepancies that damage the Sim2Real adaptation ability of models trained on the Habitat simulator and propose a novel solution tailored to deployment in real-world scenarios. We then deploy our models on a LoCoBot, a Low-Cost Robot equipped with a single Intel RealSense camera. Unlike previous work, our testing scene is unavailable to the agent in simulation. The environment is also inaccessible to the agent beforehand, so it cannot rely on scene-specific semantic priors. In this way, we reproduce a setting in which a research group (potentially from other fields) needs to employ the agent's visual navigation capabilities as a service. Our experiments indicate that it is possible to achieve satisfying results when deploying the obtained model in the real world. Our code and models are available at https://github.com/aimagelab/LoCoNav.

