Semantic segmentation is a dense, pixel-wise classification task. Deep models have made considerable progress on it; however, one remaining problem with these approaches is the loss of spatial precision, typically at the boundaries of segmented objects. Our proposed model addresses this problem by imposing an internal structure on the local feature representations while extracting a global representation that supports them. To fit this internal structure, during training we predict a Gaussian Mixture Model from the data which, combined with the skip connections and the decoding stage, helps avoid wrong inductive biases. Furthermore, our results show that semantic segmentation improves when both learned representations (global and local) are given a clustering behavior and then combined. Finally, we present results demonstrating our advances on the Cityscapes and SYNTHIA datasets.
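The abstract does not specify the model's internals, but the core idea of fitting a Gaussian Mixture Model to feature representations can be sketched roughly as follows. This is a minimal illustration, not the paper's method: it fits a GMM (via scikit-learn) to synthetic per-pixel feature vectors and computes posterior responsibilities, which a decoder could, in principle, combine with skip connections. All names, shapes, and the choice of library are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Illustrative stand-in for a network's per-pixel embeddings:
# an H x W feature map with C channels (random here).
rng = np.random.default_rng(0)
H, W, C, K = 8, 8, 4, 3  # K = assumed number of mixture components
features = rng.normal(size=(H, W, C))

# Flatten spatial dimensions so each pixel is one GMM sample.
pixels = features.reshape(-1, C)  # shape (H*W, C)

# Fit a GMM to impose a clustering structure on the local features.
gmm = GaussianMixture(n_components=K, covariance_type="diag",
                      random_state=0).fit(pixels)

# Posterior responsibilities: a soft assignment of each pixel to a
# mixture component. Such a structured map is the kind of signal the
# abstract describes merging with skip connections during decoding.
resp = gmm.predict_proba(pixels).reshape(H, W, K)
```

The responsibilities form a valid per-pixel soft clustering (each pixel's component probabilities sum to one), giving the local representation the clustering behavior the abstract refers to.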