Dynamics modeling in outdoor and unstructured environments is difficult because different elements in the environment interact with the robot in ways that are hard to predict. Leveraging multiple sensors to capture as much information as possible about the robot's surroundings is thus crucial when building a dynamics model for motion planning. We design a model capable of long-horizon motion prediction that leverages vision, lidar, and proprioception and is robust to arbitrarily missing modalities at test time. We demonstrate in simulation that our model can leverage vision to predict traction changes. We then evaluate our model on a challenging real-world dataset of a robot navigating through a forest, performing predictions on trajectories unseen during training. We test different modality combinations at test time and show that, while our model performs best when all modalities are present, it still outperforms the baseline when receiving only raw vision input without proprioception, as well as when receiving only proprioception. Overall, our study demonstrates the importance of leveraging multiple sensors for dynamics modeling in outdoor conditions.
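The robustness to missing modalities described above can be illustrated with a masked-fusion pattern: encode each available modality into a fixed-size feature and average over whichever features are present. This is a minimal sketch under assumed names and dimensions (`encode`, `fuse`, `FEAT_DIM`), not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT_DIM = 16  # illustrative shared feature dimension

def encode(obs):
    """Stand-in encoder: project a raw observation to a fixed-size feature."""
    w = np.ones((obs.size, FEAT_DIM)) / obs.size  # toy fixed projection
    return obs @ w

def fuse(features):
    """Average the features of whichever modalities are present.

    `features` maps modality name -> feature vector, or None if that
    modality is missing at test time. Averaging only over present
    modalities keeps the fused vector's scale stable regardless of how
    many modalities are dropped.
    """
    present = [f for f in features.values() if f is not None]
    if not present:
        raise ValueError("at least one modality is required")
    return np.mean(present, axis=0)

# All three modalities available...
obs = {
    "vision": encode(rng.normal(size=32)),
    "lidar": encode(rng.normal(size=64)),
    "proprioception": encode(rng.normal(size=8)),
}
full = fuse(obs)

# ...and the same call with vision and lidar missing at test time.
obs["vision"] = None
obs["lidar"] = None
proprio_only = fuse(obs)

print(full.shape, proprio_only.shape)  # both (16,)
```

A downstream dynamics head consuming the fused feature then needs no architectural change when modalities drop out, which is one simple way to realize the test-time flexibility described above.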