Autonomous vehicles clearly benefit from the expanded Field of View (FoV) of 360-degree sensors, but modern semantic segmentation approaches rely heavily on annotated training data, which is rarely available for panoramic images. We approach this problem from the perspective of domain adaptation and bring panoramic semantic segmentation to a setting where labelled training data originates from a different distribution, that of conventional pinhole camera images. To achieve this, we formalize the task of unsupervised domain adaptation for panoramic semantic segmentation and collect DensePASS, a novel densely annotated dataset for panoramic segmentation under cross-domain conditions, built specifically to study the Pinhole-to-Panoramic domain shift and accompanied by pinhole camera training examples obtained from Cityscapes. DensePASS covers both labelled and unlabelled 360-degree images, with the labelled data comprising 19 classes that explicitly match the categories available in the source (i.e., pinhole) domain. Since data-driven models are especially susceptible to changes in data distribution, we introduce P2PDA, a generic framework for Pinhole-to-Panoramic semantic segmentation which addresses the challenge of domain divergence with different variants of attention-augmented domain adaptation modules, enabling transfer in the output, feature, and feature-confidence spaces. P2PDA intertwines uncertainty-aware adaptation, using confidence values regulated on the fly through attention heads, with discrepant predictions. Our framework facilitates context exchange when learning domain correspondences and dramatically improves the adaptation performance of both accuracy- and efficiency-focused models. Comprehensive experiments verify that our framework clearly surpasses both unsupervised domain adaptation methods and specialized panoramic segmentation approaches.
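To make the "uncertainty-aware adaptation using confidence values" idea concrete, the sketch below shows one common way such per-pixel confidence can be derived: the normalised entropy of the softmax segmentation output, inverted so that 1 means a confident (near one-hot) prediction and 0 means a maximally uncertain one. This is a minimal NumPy illustration of the general technique, not the paper's P2PDA implementation; the function names and the entropy-based formulation are our own assumptions.

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def confidence_map(logits):
    """Per-pixel confidence from normalised prediction entropy.

    logits: array of shape (C, H, W) with raw segmentation scores.
    Returns an (H, W) map in [0, 1]: 1 for a one-hot prediction,
    0 for a uniform distribution over the C classes.
    (Illustrative only -- not the confidence mechanism used by P2PDA.)
    """
    probs = softmax(logits, axis=0)
    num_classes = logits.shape[0]
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=0)
    return 1.0 - entropy / np.log(num_classes)
```

In an adaptation loop, such a map could weight the unsupervised loss on unlabelled panoramic pixels so that uncertain regions contribute less; the abstract's attention-regulated confidence values play an analogous gating role.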



