While self-training has advanced semi-supervised semantic segmentation, it severely suffers from the long-tailed class distribution of real-world semantic segmentation datasets, which biases the pseudo-labeled data toward majority classes. In this paper, we present a simple yet effective Distribution Alignment and Random Sampling (DARS) method to produce unbiased pseudo labels that match the true class distribution estimated from the labeled data. Besides, we also contribute a progressive data augmentation and labeling strategy to facilitate model training with pseudo-labeled data. Experiments on both the Cityscapes and PASCAL VOC 2012 datasets demonstrate the effectiveness of our approach. Albeit simple, our method performs favorably in comparison with state-of-the-art approaches. Code will be available at https://github.com/CVMI-Lab/DARS.
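To make the core idea concrete, below is a minimal NumPy sketch of how pseudo labels could be aligned to a class distribution estimated from labeled data, with surplus pixels of over-represented classes randomly dropped. This is an illustrative assumption based only on the abstract, not the authors' implementation (see the linked repository for that); the names `align_pseudo_labels`, `probs`, and `target_dist` are hypothetical.

```python
import numpy as np

def align_pseudo_labels(probs, target_dist, ignore_index=255, seed=0):
    """Illustrative sketch: assign pseudo labels whose class frequencies
    roughly follow `target_dist` (estimated from the labeled data).

    probs:       (N, C) softmax scores for N unlabeled pixels over C classes.
    target_dist: (C,) estimated true class distribution, summing to 1.
    Returns an (N,) array of pseudo labels; surplus pixels of over-represented
    classes are randomly set to `ignore_index` so they are excluded from training.
    """
    rng = np.random.default_rng(seed)
    n, c = probs.shape
    hard = probs.argmax(axis=1)                 # initial (biased) hard pseudo labels
    out = np.full(n, ignore_index, dtype=np.int64)
    for k in range(c):
        idx = np.where(hard == k)[0]
        quota = int(round(target_dist[k] * n))  # pixel budget for class k
        if len(idx) <= quota:
            out[idx] = k                        # under-represented class: keep all
        else:
            # Over-represented class: randomly sample within the predicted set
            # instead of keeping everything, so the label distribution is not
            # skewed toward majority classes.
            keep = rng.choice(idx, size=quota, replace=False)
            out[keep] = k
    return out
```

In this sketch, minority classes keep all of their predicted pixels while majority classes are subsampled to their budget, which is one simple way to counteract the long-tailed bias described above.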