We propose SplitNet, a method for decoupling visual perception and policy learning. By incorporating auxiliary tasks and selectively training portions of the model, we explicitly decompose the learning objectives for visual navigation into perceiving the world and acting on that perception. We show dramatic improvements over baseline models when transferring between simulators, an encouraging step towards Sim2Real. Additionally, SplitNet generalizes better to unseen environments from the same simulator and transfers faster and more effectively to novel embodied navigation tasks. Further, given only a small sample from a target domain, SplitNet can match the performance of traditional end-to-end pipelines trained on the entire dataset. Code is available at https://github.com/facebookresearch/splitnet
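The decoupling described above can be sketched in a few lines. The following is a hypothetical NumPy illustration, not the released code: the agent's parameters are partitioned into a visual encoder and a policy head, and when transferring to a new task only the policy head is updated, leaving perception intact (the names `SplitAgent`, `W_enc`, and `W_pol` are inventions for this sketch).

```python
import numpy as np

rng = np.random.default_rng(0)

class SplitAgent:
    """Toy agent with decoupled perception and policy parameters."""

    def __init__(self, obs_dim=8, feat_dim=4, n_actions=3):
        self.W_enc = rng.normal(size=(feat_dim, obs_dim))    # perception
        self.W_pol = rng.normal(size=(n_actions, feat_dim))  # action

    def features(self, obs):
        # Visual encoder: maps raw observation to a feature vector.
        return np.tanh(self.W_enc @ obs)

    def act_logits(self, obs):
        # Policy head: maps features to action scores.
        return self.W_pol @ self.features(obs)

    def update(self, obs, grad_logits, lr=0.1, freeze_encoder=True):
        # Selective learning: with the encoder frozen, gradients are
        # applied to the policy head only.
        f = self.features(obs)
        self.W_pol -= lr * np.outer(grad_logits, f)
        if not freeze_encoder:
            pass  # encoder gradient omitted in this sketch

agent = SplitAgent()
obs = rng.normal(size=8)
enc_before = agent.W_enc.copy()
pol_before = agent.W_pol.copy()
# Transfer step on a "new task" with the encoder frozen:
agent.update(obs, grad_logits=np.ones(3), freeze_encoder=True)
```

After the update, the encoder weights are bitwise unchanged while the policy head has moved, which is the mechanism that lets perception transfer across simulators or tasks while only the (much smaller) policy is retrained.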