Neural-network-based LiDAR object detection algorithms for autonomous driving require large amounts of data for training, validation, and testing. Because real-world data collection and labeling are time-consuming and expensive, simulation-based synthetic data generation is a viable alternative. However, training neural networks on simulated data introduces a domain shift between training and testing data due to differences in scenes, scenarios, and distributions. In this work, we quantify the sim-to-real domain shift using LiDAR object detectors trained on a new pair of scenario-identical real-world and simulated datasets. In addition, we answer two questions: how well the simulated data resembles the real-world data, and how well object detectors trained on simulated data perform on real-world data. Further, we analyze point clouds at the target level by comparing real-world and simulated point clouds within the 3D bounding boxes of the targets. Our experiments show that a significant sim-to-real domain shift exists even for our scenario-identical datasets. This domain shift amounts to an average precision reduction of around 14% for object detectors trained with simulated data. Additional experiments reveal that this domain shift can be reduced by introducing a simple noise model in simulation. We further show that a simple downsampling method to model real-world physics does not influence the performance of the object detectors.
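The abstract does not specify the noise model or the downsampling method, so the following is only a minimal sketch of what such augmentations of a simulated point cloud could look like. It assumes zero-mean Gaussian noise applied along each ray from a sensor at the origin, and uniform random point dropout as the downsampling step. The function names `add_range_noise` and `random_downsample`, and the parameters `sigma` and `keep_ratio`, are hypothetical and not taken from the paper.

```python
import numpy as np


def add_range_noise(points, sigma=0.02, rng=None):
    """Perturb each point along its ray from the sensor origin.

    Assumption: zero-mean Gaussian range noise with standard deviation
    `sigma` (in the same unit as the points), sensor located at the origin.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Distance of every point from the sensor, kept as a column vector.
    ranges = np.linalg.norm(points, axis=1, keepdims=True)
    # Unit ray directions; the floor avoids division by zero at the origin.
    directions = points / np.maximum(ranges, 1e-9)
    noisy_ranges = ranges + rng.normal(0.0, sigma, size=ranges.shape)
    return directions * noisy_ranges


def random_downsample(points, keep_ratio=0.9, rng=None):
    """Keep each point independently with probability `keep_ratio`.

    Assumption: uniform random dropout as a crude stand-in for
    returns that are lost in the real sensor.
    """
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(len(points)) < keep_ratio
    return points[mask]
```

Applied to an `(N, 3)` array of simulated points, `random_downsample(add_range_noise(points))` would yield a noisier, sparser cloud that could then be passed to detector training in place of the clean simulation output.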