Semantic segmentation is one of the most fundamental problems in computer vision, with significant impact on a wide variety of applications. Adversarial learning has been shown to be an effective approach for improving semantic segmentation quality by enforcing higher-level pixel correlations and structural information. However, state-of-the-art semantic segmentation models cannot be easily plugged into an adversarial setting because they are not designed to accommodate the convergence and stability issues of adversarial networks. We bridge this gap by building a conditional adversarial network with a state-of-the-art segmentation model (DeepLabv3+) at its core. To combat the stability issues, we introduce a novel lookahead adversarial learning (LoAd) approach with an embedded label map aggregation module. We focus on semantic segmentation models that run fast at inference for near real-time field applications. Through extensive experimentation, we demonstrate that the proposed solution alleviates divergence issues in an adversarial semantic segmentation setting and yields considerable performance improvements over the baseline (+5% in some classes) on three standard datasets.
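To make the conditional adversarial setting concrete, the following is a minimal sketch of the two generic objectives such a setup typically balances: a segmentor trained with per-pixel cross-entropy plus an adversarial term, and a discriminator that scores (image, label map) pairs. It is not the LoAd procedure or the label map aggregation module, whose details are not given here. The function names, the scalar discriminator scores, and the weight `adv_weight` are illustrative assumptions.

```python
import math

EPS = 1e-12  # guards against log(0)

def pixel_cross_entropy(probs, labels):
    """Mean cross-entropy over pixels; probs[i] lists class probabilities for pixel i."""
    return -sum(math.log(p[y] + EPS) for p, y in zip(probs, labels)) / len(labels)

def bce(score, target):
    """Binary cross-entropy for one discriminator score in [0, 1]."""
    return -(target * math.log(score + EPS)
             + (1.0 - target) * math.log(1.0 - score + EPS))

def segmentor_loss(probs, labels, d_score_on_pred, adv_weight=0.1):
    """Supervised term plus a term that rewards fooling the discriminator.

    adv_weight is a hypothetical trade-off value, not one taken from the paper.
    """
    return pixel_cross_entropy(probs, labels) + adv_weight * bce(d_score_on_pred, 1.0)

def discriminator_loss(d_score_on_truth, d_score_on_pred):
    """Ground-truth label maps should score 1, predicted label maps 0."""
    return bce(d_score_on_truth, 1.0) + bce(d_score_on_pred, 0.0)

if __name__ == "__main__":
    probs = [[0.7, 0.3], [0.2, 0.8]]   # two pixels, two classes
    labels = [0, 1]
    print(segmentor_loss(probs, labels, d_score_on_pred=0.4))
    print(discriminator_loss(d_score_on_truth=0.9, d_score_on_pred=0.4))
```

In a sketch like this, the stability issues mentioned above come from alternating updates of the two losses: as the discriminator improves, the adversarial term can dominate or oscillate, which is the kind of divergence the proposed approach is meant to alleviate.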