In this paper we address the problem of training a LiDAR semantic segmentation network using a fully labeled source dataset and a target dataset that has only a small number of labels. To this end, we develop a novel image-to-image translation engine and couple it with a LiDAR semantic segmentation network, resulting in an integrated domain adaptation architecture we call HYLDA. To train the system end-to-end, we adopt a diverse set of learning paradigms: 1) self-supervision on a simple auxiliary reconstruction task, 2) semi-supervised training using the few available labeled target-domain frames, and 3) unsupervised training on the fake translated images generated by the image-to-image translation stage, together with the labeled frames from the source domain. In the latter case, the semantic segmentation network participates in updating the image-to-image translation engine. We demonstrate experimentally that HYLDA effectively addresses the challenging problem of improving generalization on validation data from the target domain when only a few labeled target frames are available for training. We perform an extensive evaluation comparing HYLDA against strong baseline methods on two publicly available LiDAR semantic segmentation datasets.