Machine learning provides a powerful tool for building socially compliant robotic systems that go beyond simple predictive models of human behavior. By observing and understanding human interactions from past experiences, learning can enable effective social navigation behaviors directly from data. However, collecting navigation data in human-occupied environments may require teleoperation or continuous monitoring, making the process prohibitively expensive to scale. In this paper, we present SACSoN, a scalable data collection system for vision-based navigation that can autonomously navigate around pedestrians in challenging real-world environments while encouraging rich interactions. SACSoN uses visual observations to perceive and react to humans in its vicinity. It couples this visual understanding with continual learning and an autonomous collision recovery system that limits the involvement of a human operator, allowing for better dataset scaling. We use this system to collect the SACSoN dataset, the largest-of-its-kind visual navigation dataset of autonomous robots operating in human-occupied spaces, spanning over 75 hours and 4000 rich interactions with humans. Our experiments show that collecting data with a novel objective that encourages interactions leads to significant improvements in downstream tasks such as inferring pedestrian dynamics and learning socially compliant navigation behaviors. We make videos of our autonomous data collection system and the SACSoN dataset publicly available on our project page.