It is challenging to train a robust object detector when annotated data is scarce. Existing approaches to this problem include semi-supervised learning, which infers labels for unlabeled data, and self-supervised learning, which exploits signals within unlabeled data via pretext tasks. Without changing the supervised learning paradigm, we introduce an offline data augmentation method for object detection that semantically interpolates the training data with novel views. Specifically, the proposed system generates controllable views of training images based on differentiable neural rendering, together with corresponding bounding box annotations, requiring no human intervention. First, we extract pixel-aligned image features and project them into point clouds while estimating depth maps. We then re-project the point clouds under a target camera pose and render a novel-view 2D image. Objects are marked as keypoints in the point clouds so that their annotations can be recovered in the new views. Our method is fully compatible with online data augmentation methods such as affine transforms, image mixup, etc. Extensive experiments show that our method, as a cost-free tool to enrich images and labels, can significantly boost the performance of object detection systems with scarce training data. Code is available at \url{https://github.com/Guanghan/DANR}.
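The back-project/re-project step above can be sketched with a standard pinhole camera model: each pixel is lifted into 3D using its estimated depth and the camera intrinsics, transformed by the relative pose of the target view, and projected back onto the new image plane. The function names, the dense constant-depth toy input, and the intrinsics below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def backproject(depth, K):
    """Lift every pixel (u, v) with depth d into a 3D point d * K^-1 @ [u, v, 1]."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N homogeneous pixels
    return (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)             # 3 x N point cloud

def reproject(pts, K, R, t):
    """Apply the relative pose (R, t) of the target view and project back to pixels."""
    cam = R @ pts + t.reshape(3, 1)          # points in the target camera frame
    pix = K @ cam
    return pix[:2] / pix[2:3], cam[2]        # (u, v) pixel coordinates and new depths

# Toy check: hypothetical intrinsics, a flat depth map at 2.0 m, identity pose.
# The identity pose must map every pixel back onto itself.
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 32.0],
              [0.0,   0.0,  1.0]])
depth = np.full((64, 64), 2.0)
pts = backproject(depth, K)
uv, z = reproject(pts, K, np.eye(3), np.zeros(3))
```

Bounding box annotations are carried along the same path: the 3D points tagged as object keypoints are transformed and projected exactly like the rest of the cloud, and the new box is read off their 2D extent in the target view.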