Deep learning has proven crucial for the increasingly complex tasks that autonomous driving demands, such as 3D detection and instance segmentation. State-of-the-art approaches for image-based detection tasks tackle this complexity by operating in a cascaded fashion: they first extract a 2D bounding box, from which additional attributes, e.g. instance masks, are then inferred. While these methods perform well, a key challenge remains the lack of accurate and cheap annotations for the growing variety of tasks. Synthetic data presents a promising solution but, despite considerable effort in domain adaptation research, the gap between synthetic and real data remains an open problem. In this work, we propose a weakly supervised domain adaptation setting that exploits the structure of cascaded detection tasks. In particular, we learn to infer the attributes solely from the source domain while leveraging 2D bounding boxes as weak labels in both domains to explain the domain shift. We further encourage domain-invariant features through class-wise feature alignment using ground-truth class information, which is not available in the unsupervised setting. As our experiments demonstrate, the approach is competitive with fully supervised settings while outperforming unsupervised adaptation approaches by a large margin.
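As an illustration of what class-wise feature alignment could look like, the following is a minimal sketch and not the formulation used in this work. It assumes a simple class-mean matching loss: features from both domains are grouped by their ground-truth class (available here through the weak 2D box labels), and the loss is the average Euclidean distance between the source and target mean feature of each shared class. The function names `class_means` and `classwise_alignment_loss`, the choice of distance, and the plain-list feature representation are all assumptions made for the example.

```python
from collections import defaultdict
from math import sqrt


def class_means(features, labels):
    """Average the feature vectors of each class; returns {class: mean vector}."""
    groups = defaultdict(list)
    for vec, cls in zip(features, labels):
        groups[cls].append(vec)
    return {
        cls: [sum(col) / len(vecs) for col in zip(*vecs)]
        for cls, vecs in groups.items()
    }


def classwise_alignment_loss(src_feats, src_labels, tgt_feats, tgt_labels):
    """Mean Euclidean distance between source and target class means.

    Hypothetical loss for illustration only (assumed, not taken from the paper).
    Only classes that occur in both domains contribute to the result.
    """
    src_means = class_means(src_feats, src_labels)
    tgt_means = class_means(tgt_feats, tgt_labels)
    shared = sorted(set(src_means) & set(tgt_means))
    if not shared:
        return 0.0  # nothing to align when no class appears in both domains
    dists = [
        sqrt(sum((s - t) ** 2 for s, t in zip(src_means[c], tgt_means[c])))
        for c in shared
    ]
    return sum(dists) / len(dists)


# Toy per-box features: synthetic (source) and real (target) domain.
src_feats = [[0.0, 0.0], [2.0, 0.0], [1.0, 1.0]]
src_labels = ["car", "car", "pedestrian"]
tgt_feats = [[1.0, 3.0], [1.0, 5.0]]
tgt_labels = ["car", "pedestrian"]

loss = classwise_alignment_loss(src_feats, src_labels, tgt_feats, tgt_labels)
```

In the toy example the class means differ by 3 for `car` and by 4 for `pedestrian`, so the loss is their average. In a real pipeline such a term would be computed on network features in a differentiable framework and added to the task losses; in the unsupervised setting it could not be computed this way, because target class labels would be missing.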