Prevalent semantic segmentation solutions are, in essence, dense discriminative classifiers of p(class | pixel feature). Though straightforward, this de facto paradigm neglects the underlying data distribution p(pixel feature | class) and struggles to identify out-of-distribution data. Going beyond this, we propose GMMSeg, a new family of segmentation models that rely on a dense generative classifier for the joint distribution p(pixel feature, class). For each class, GMMSeg builds Gaussian Mixture Models (GMMs) via Expectation-Maximization (EM) so as to capture the class-conditional densities. Meanwhile, the deep dense representation is trained end-to-end in a discriminative manner, i.e., by maximizing p(class | pixel feature). This endows GMMSeg with the strengths of both generative and discriminative models. With a variety of segmentation architectures and backbones, GMMSeg outperforms its discriminative counterparts on three closed-set datasets. More impressively, without any modification, GMMSeg even performs well on open-world datasets. We believe this work brings fundamental insights into the related fields.
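To make the generative-classification idea concrete, the following is a minimal sketch (not the authors' implementation) of the inference step such a model implies: each class holds a GMM over pixel features, giving p(pixel feature | class), and the pixel posterior p(class | pixel feature) follows from Bayes' rule under a uniform class prior. The EM-based fitting of the GMM parameters and the discriminative training of the feature extractor are omitted; all tensor names and shapes are illustrative assumptions.

```python
# Sketch of dense generative classification with per-class GMMs (diagonal covariances).
import math
import torch

def gmm_log_likelihood(x, pi, mu, log_var):
    """Per-class GMM log p(x | c).

    x:       (N, D)     pixel features
    pi:      (C, M)     mixture weights per class (rows sum to 1)
    mu:      (C, M, D)  component means
    log_var: (C, M, D)  component log-variances
    returns: (N, C)     log-likelihood of each pixel under each class GMM
    """
    x = x[:, None, None, :]                                   # (N, 1, 1, D)
    diff = x - mu[None]                                       # (N, C, M, D)
    # log N(x; mu, diag(var)), summed over feature dimensions
    log_norm = -0.5 * (log_var[None]
                       + diff.pow(2) / log_var[None].exp()
                       + math.log(2 * math.pi)).sum(-1)       # (N, C, M)
    # log sum_m pi_m N(x; mu_m, var_m), via logsumexp for numerical stability
    return torch.logsumexp(torch.log(pi)[None] + log_norm, dim=-1)  # (N, C)

def class_posterior(x, pi, mu, log_var):
    """p(class | x) from the joint p(x, class), assuming a uniform class prior."""
    log_px_given_c = gmm_log_likelihood(x, pi, mu, log_var)   # (N, C)
    return torch.softmax(log_px_given_c, dim=-1)              # Bayes' rule

# Toy usage: 4 pixels, 3 classes, 2 mixture components per class, 8-dim features.
N, C, M, D = 4, 3, 2, 8
x = torch.randn(N, D)
pi = torch.softmax(torch.randn(C, M), dim=-1)
mu = torch.randn(C, M, D)
log_var = torch.zeros(C, M, D)
print(class_posterior(x, pi, mu, log_var).shape)              # torch.Size([4, 3])
```

Because the class-conditional densities p(x | c) are modeled explicitly, a pixel that is unlikely under every class GMM can be flagged as out-of-distribution, which a purely discriminative softmax head cannot do.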