Unsupervised Foggy Scene Understanding via Self Spatial-Temporal Label Diffusion

Understanding foggy image sequences in driving scenes is critical for autonomous driving, but it remains a challenging task because real-world images of adverse weather are difficult to collect and annotate. Recently, self-training has been considered a powerful solution for unsupervised domain adaptation: it iteratively adapts the model from the source domain to the target domain by generating target pseudo labels and re-training the model. However, the selection of confident pseudo labels inevitably faces a conflict between sparsity and accuracy: overly sparse labels and inaccurate labels both lead to suboptimal models. To tackle this problem, we exploit the characteristics of foggy image sequences of driving scenes to densify the confident pseudo labels. Specifically, based on two discoveries about sequential image data, local spatial similarity and adjacent temporal correspondence, we propose a novel Target-Domain driven pseudo label Diffusion (TDo-Dif) scheme. It employs superpixels and optical flows to identify the spatial similarity and temporal correspondence, respectively, and then diffuses the confident but sparse pseudo labels within a superpixel or across a temporally corresponding pair linked by the flow. Moreover, to ensure the feature similarity of the diffused pixels, we introduce a local spatial similarity loss and a temporal contrastive loss in the model re-training stage. Experimental results show that our TDo-Dif scheme helps the adaptive model achieve 51.92% and 53.84% mean intersection-over-union (mIoU) on two publicly available natural foggy datasets (Foggy Zurich and Foggy Driving), exceeding state-of-the-art unsupervised domain adaptive semantic segmentation methods. Models and data can be found at https://github.com/velor2012/TDo-Dif.
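
The two diffusion steps described above can be illustrated with a minimal NumPy sketch. It is not the released TDo-Dif implementation: the function names, the `IGNORE = 255` marker for pixels without a confident pseudo label, the majority-vote rule inside a superpixel, and the nearest-pixel forward warp along the flow are all assumptions made for illustration.

```python
import numpy as np

IGNORE = 255  # assumed marker for pixels without a confident pseudo label


def diffuse_spatial(labels, superpixels, ignore=IGNORE):
    """Fill unlabeled pixels with the majority confident label of their superpixel.

    labels:      (H, W) integer class ids, `ignore` where not confident.
    superpixels: (H, W) integer superpixel ids (hypothetical input, from any
                 superpixel segmentation).
    """
    out = labels.copy()
    for sp in np.unique(superpixels):
        mask = superpixels == sp
        confident = labels[mask]
        confident = confident[confident != ignore]
        if confident.size == 0:
            continue  # no confident seed in this superpixel, leave it empty
        majority = np.bincount(confident).argmax()
        out[mask & (labels == ignore)] = majority
    return out


def diffuse_temporal(labels_t, labels_next, flow, ignore=IGNORE):
    """Copy confident labels from frame t to frame t+1 along the optical flow.

    flow: (H, W, 2) displacement (dx, dy) from frame t to frame t+1
          (hypothetical input, from any optical flow estimator).
    Only pixels of frame t+1 that are still unlabeled receive a label.
    """
    h, w = labels_t.shape
    out = labels_next.copy()
    ys, xs = np.nonzero(labels_t != ignore)
    # Nearest-pixel target coordinates in frame t+1.
    xt = np.rint(xs + flow[ys, xs, 0]).astype(int)
    yt = np.rint(ys + flow[ys, xs, 1]).astype(int)
    inside = (xt >= 0) & (xt < w) & (yt >= 0) & (yt < h)
    ys, xs, yt, xt = ys[inside], xs[inside], yt[inside], xt[inside]
    empty = out[yt, xt] == ignore
    out[yt[empty], xt[empty]] = labels_t[ys[empty], xs[empty]]
    return out
```

In this sketch, a single confident pixel is enough to label its whole superpixel, and a label is carried to the next frame only where that frame has no confident label of its own. A practical pipeline would likely add further checks (for example, a minimum share of agreeing confident pixels per superpixel, or a flow consistency test), which are omitted here.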