对比度和混和: 与背景混和相容的实时对调视频域域 (Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing)

Unsupervised domain adaptation which aims to adapt models trained on a labeled source domain to a completely unlabeled target domain has attracted much attention in recent years. While many domain adaptation techniques have been proposed for images, the problem of unsupervised domain adaptation in videos remains largely underexplored. In this paper, we introduce Contrast and Mix (CoMix), a new contrastive learning framework that aims to learn discriminative invariant feature representations for unsupervised video domain adaptation. First, unlike existing methods that rely on adversarial learning for feature alignment, we utilize temporal contrastive learning to bridge the domain gap by maximizing the similarity between encoded representations of an unlabeled video at two different speeds as well as minimizing the similarity between different videos played at different speeds. Second, we propose a novel extension to the temporal contrastive loss by using background mixing that allows additional positives per anchor, thus adapting contrastive learning to leverage action semantics shared across both domains. Moreover, we also integrate a supervised contrastive learning objective using target pseudo-labels to enhance discriminability of the latent space for video domain adaptation. Extensive experiments on several benchmark datasets demonstrate the superiority of our proposed approach over state-of-the-art methods. Project page: https://cvir.github.io/projects/comix

翻译：未经监督的域适应旨在将在标签源域上培训的模型改造成完全没有标签的目标域,这在最近几年引起了人们的极大关注。虽然提出了许多用于图像的域适应技术,但视频中未经监督的域适应问题在很大程度上仍未得到充分探讨。在本文中,我们引入了对比和Mix(CoMix),这是一个新的对比学习框架,目的是学习在标签源域上培训的差别性特征表现,用于不受监督的视频域适应。首先,与依赖对抗性学习来调整特征的现有方法不同,我们利用时间对比性学习来缩小域间差距,办法是以两种不同速度对未标记的视频进行编码表达,同时尽量减少以不同速度播放的不同视频之间的相似性。第二,我们提出对时间对比性损失进行新的扩展,方法是使用背景混合,允许额外的正点固定,从而调整对比性学习,以利用在两个域间共享的行动语义学。此外,我们还结合了一种监督的对比性学习目标,使用目标假标签,以加强视频域内隐性空间的可视性,以两种不同的速度进行对比性调整。跨度实验:Meurgistial-prisquestal-proview produstrismismismismismismismismismismismismismismismismismation。