We present JRDB, a novel egocentric dataset collected from our social mobile manipulator JackRabbot. The dataset includes 64 minutes of annotated multimodal sensor data, including stereo cylindrical 360$^\circ$ RGB video at 15 fps, 3D point clouds from two Velodyne 16 Lidars, line 3D point clouds from two Sick Lidars, an audio signal, RGB-D video at 30 fps, a 360$^\circ$ spherical image from a fisheye camera, and encoder values from the robot's wheels. Our dataset incorporates data from traditionally underrepresented scenes such as indoor environments and pedestrian areas, all from the ego-perspective of the robot, captured both while stationary and while navigating. The dataset has been annotated with over 2.3 million 2D bounding boxes spread over 5 individual cameras and 1.8 million associated 3D cuboids around all people in the scenes, totaling over 3,500 time-consistent trajectories. Together with our dataset and the annotations, we launch a benchmark and metrics for 2D and 3D person detection and tracking. With this dataset, which we plan to extend with further types of annotation in the future, we hope to provide a new source of data and a test-bench for research in the areas of egocentric robot vision, autonomous navigation, and all perceptual tasks around social robotics in human environments.
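To make the annotation structure concrete, the following is a minimal illustrative sketch of how one annotated person at a single timestamp could be represented: a time-consistent track ID linking per-camera 2D bounding boxes to an associated 3D cuboid. The field names and layout here are hypothetical assumptions for illustration, not the official JRDB annotation format or devkit API.

```python
# Hypothetical sketch of a JRDB-style per-person annotation record.
# Field names and conventions are illustrative assumptions, not the
# actual JRDB annotation schema.
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Box2D:
    """Axis-aligned 2D bounding box in pixel coordinates."""
    x: float  # top-left x
    y: float  # top-left y
    w: float  # width
    h: float  # height


@dataclass
class Cuboid3D:
    """3D cuboid in the robot's ego frame."""
    center: Tuple[float, float, float]  # (x, y, z) in meters
    size: Tuple[float, float, float]    # (length, width, height) in meters
    yaw: float                          # heading angle in radians


@dataclass
class PersonAnnotation:
    track_id: int               # identity kept consistent across frames
    boxes_2d: Dict[str, Box2D]  # one 2D box per camera the person appears in
    cuboid_3d: Cuboid3D         # associated 3D cuboid from the Lidar point clouds


# Example: one person visible in a single (hypothetically named) camera,
# with an associated 3D cuboid and a track ID shared across all frames.
person = PersonAnnotation(
    track_id=17,
    boxes_2d={"camera_2": Box2D(x=412.0, y=108.0, w=64.0, h=180.0)},
    cuboid_3d=Cuboid3D(center=(3.2, -0.5, 0.9), size=(0.5, 0.5, 1.8), yaw=1.57),
)
```

Linking the 2D boxes and the 3D cuboid through a shared track ID is what makes the annotations usable for both 2D/3D detection (per-frame) and tracking (across frames) benchmarks described above.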