In recent years, computer vision tasks such as target tracking and human pose estimation have benefited immensely from synthetic data generation and novel rendering techniques. In contrast, methods in robotics, especially for robot perception, have been slow to adopt these techniques. This is because state-of-the-art simulation frameworks for robotics lack at least one of full controllability, integration with the Robot Operating System (ROS), realistic physics, or photorealism. To address this, we present a fully customizable framework for Generating Realistic Animated Dynamic Environments (GRADE) for robotics research, focused primarily on robot perception. The framework can be used either to generate ground-truth data for vision-related robotic tasks and offline processing, or to run online experiments with robots in dynamic environments. We build upon NVIDIA Isaac Sim to allow control of custom robots, and we provide methods to include assets, populate and control the simulation, and process the resulting data. Using autonomous robots in GRADE, we generate video datasets of an indoor dynamic environment. First, we use these data to demonstrate the framework's visual realism by evaluating the sim-to-real gap in experiments with YOLO and Mask R-CNN. Second, we benchmark dynamic SLAM algorithms on the same data. These experiments not only show that GRADE can significantly improve training performance and generalization to real sequences, but also highlight how current dynamic SLAM methods over-rely on known benchmarks and fail to generalize. We also introduce a method to precisely repeat a previously recorded experiment while allowing changes in the robot's surroundings. Code and data are provided as open source at https://grade.is.tue.mpg.de.