Data augmentation is an important technique for improving data efficiency and reducing labeling cost for 3D detection in point clouds. Yet existing augmentation policies have been designed to use only labeled data, which limits data diversity. In this paper, we recognize that pseudo labeling and data augmentation are complementary, and thus propose to leverage unlabeled data for data augmentation to enrich the training data. In particular, we design three novel pseudo-label-based data augmentation policies (PseudoAugments) that fuse labeled and pseudo-labeled scenes at the level of frames (PseudoFrame), objects (PseudoBBox), and background (PseudoBackground). PseudoAugments outperforms pseudo labeling by mitigating pseudo-labeling errors and generating diverse fused training scenes. We demonstrate that PseudoAugments generalizes across point-based and voxel-based architectures, different model capacities, and both KITTI and the Waymo Open Dataset. To alleviate the cost of hyperparameter tuning and iterative pseudo labeling, we develop a population-based data augmentation framework for 3D detection, named AutoPseudoAugment. Unlike previous works that perform pseudo labeling offline, our framework performs PseudoAugments and hyperparameter tuning in one shot to reduce computational cost. Experimental results on the large-scale Waymo Open Dataset show that our method outperforms the state-of-the-art auto data augmentation method (PPBA) and the self-training method (pseudo labeling). In particular, AutoPseudoAugment is about 3x and 2x more data efficient on the vehicle and pedestrian tasks, respectively, compared to prior art. Notably, AutoPseudoAugment nearly matches the results of training on the full dataset with just 10% of the labeled run segments on the vehicle detection task.
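As a rough illustration of the object-level idea behind PseudoBBox, the sketch below pastes confidently pseudo-labeled objects from an unlabeled scene into a labeled scene. It is not the paper's implementation: the function names, the axis-aligned box format `(cx, cy, cz, dx, dy, dz)` with no heading angle, the `score_threshold` value, and the collision rule are all assumptions made for the example.

```python
import numpy as np


def points_in_box(points, box):
    """Boolean mask of points inside an axis-aligned box (cx, cy, cz, dx, dy, dz)."""
    center, size = box[:3], box[3:6]
    return np.all(np.abs(points[:, :3] - center) <= size / 2.0, axis=1)


def boxes_overlap(box_a, box_b):
    """True if two axis-aligned boxes intersect along all three axes."""
    gap = np.abs(box_a[:3] - box_b[:3])
    return bool(np.all(gap < (box_a[3:6] + box_b[3:6]) / 2.0))


def paste_pseudo_objects(labeled_points, labeled_boxes,
                         pseudo_points, pseudo_boxes, pseudo_scores,
                         score_threshold=0.7):
    """Fuse a labeled scene with high-confidence pseudo-labeled objects.

    Hypothetical helper: keeps only pseudo boxes whose score passes the
    threshold and that do not collide with boxes already in the scene.
    """
    scene_points = labeled_points
    fused_boxes = list(labeled_boxes)
    for box, score in zip(pseudo_boxes, pseudo_scores):
        if score < score_threshold:
            continue  # drop low-confidence pseudo labels
        if any(boxes_overlap(box, other) for other in fused_boxes):
            continue  # avoid pasting on top of an existing object
        object_points = pseudo_points[points_in_box(pseudo_points, box)]
        if len(object_points) == 0:
            continue  # nothing to paste for this box
        # Clear the target region in the scene, then insert the object points.
        scene_points = scene_points[~points_in_box(scene_points, box)]
        scene_points = np.concatenate([scene_points, object_points], axis=0)
        fused_boxes.append(box)
    return scene_points, np.array(fused_boxes).reshape(-1, 6)
```

Filtering by score before pasting is one simple way to limit the effect of pseudo-labeling errors on the fused scene; a real pipeline would also need to handle rotated boxes and occlusion.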