Realistic simulators are critical for training and verifying robotics systems. While most contemporary simulators are hand-crafted, a scalable way to build them is to use machine learning to learn how the environment behaves in response to an action, directly from data. In this work, we aim to learn to simulate a dynamic environment directly in pixel-space, by watching unannotated sequences of frames and their associated action pairs. We introduce a novel high-quality neural simulator, referred to as DriveGAN, that achieves controllability by disentangling different components without supervision. In addition to steering controls, it also includes controls for sampling features of a scene, such as the weather and the locations of non-player objects. Since DriveGAN is a fully differentiable simulator, it further allows for re-simulation of a given video sequence, enabling an agent to drive through a recorded scene again, possibly taking different actions. We train DriveGAN on multiple datasets, including 160 hours of real-world driving data. We showcase that our approach greatly surpasses the performance of previous data-driven simulators and enables new features not explored before.
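To make the core idea concrete, below is a minimal sketch of the action-conditioned simulation loop the abstract describes: a network encodes a frame into a latent state, predicts the next latent state given an action, and decodes it back to pixels, trained end-to-end on (frame, action, next frame) triples mined from video. This is not the authors' architecture; the class and layer choices (`NeuralSimulator`, `encoder`, `dynamics`, `decoder`) are hypothetical, and a simple reconstruction loss stands in for the adversarial and disentanglement objectives the DriveGAN name implies.

```python
import torch
import torch.nn as nn

class NeuralSimulator(nn.Module):
    """Hypothetical sketch of a learned, differentiable pixel-space simulator."""
    def __init__(self, latent_dim=128, action_dim=3, img_channels=3):
        super().__init__()
        # Encode an observed 64x64 frame into a compact latent state.
        self.encoder = nn.Sequential(
            nn.Conv2d(img_channels, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(latent_dim),
        )
        # Predict the next latent state from the current state and an action.
        self.dynamics = nn.Sequential(
            nn.Linear(latent_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        # Render a latent state back to pixels.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, img_channels, 4, stride=2, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, frame, action):
        z = self.encoder(frame)
        z_next = self.dynamics(torch.cat([z, action], dim=-1))
        return self.decoder(z_next)

# Training step: regress the predicted next frame onto the observed one,
# using only unannotated (frame, action, next_frame) triples.
sim = NeuralSimulator()
frame = torch.rand(8, 3, 64, 64)       # dummy batch of current frames
action = torch.rand(8, 3)              # dummy steering/throttle actions
next_frame = torch.rand(8, 3, 64, 64)  # dummy ground-truth next frames
loss = nn.functional.mse_loss(sim(frame, action), next_frame)
loss.backward()  # the whole pipeline is differentiable, as re-simulation requires
```

Because gradients flow from rendered pixels back through the dynamics and encoder, a recorded sequence can in principle be re-simulated under different action inputs, which is the property the abstract highlights.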