In this work, we introduce panoramic panoptic segmentation as the most holistic scene understanding, both in terms of Field of View (FoV) and image-level understanding, for standard camera-based input. A complete surrounding understanding provides a mobile agent with a maximum of information, which is essential for any intelligent vehicle to make informed decisions in a safety-critical dynamic environment such as real-world traffic. To overcome the lack of annotated panoramic images, we propose a framework that allows model training on standard pinhole images and transfers the learned features to a different domain in a cost-minimizing way. Using our proposed method with dense contrastive learning, we achieve significant improvements over a non-adapted approach. Depending on the efficient panoptic segmentation architecture, we achieve improvements of 3.5-6.5% measured in Panoptic Quality (PQ) over non-adapted models on our established Wild Panoramic Panoptic Segmentation (WildPPS) dataset. Furthermore, our efficient framework does not need access to images of the target domain, making it a feasible domain generalization approach suitable for limited hardware settings. As additional contributions, we publish WildPPS, the first panoramic panoptic image dataset, to foster progress in surrounding perception, and we explore a novel training procedure combining supervised and contrastive training.
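For reference, Panoptic Quality (PQ), the metric reported above, matches predicted and ground-truth segments and is defined over the sets of true positives (TP), false positives (FP), and false negatives (FN) as

\[
\mathrm{PQ} = \frac{\sum_{(p,g) \in \mathit{TP}} \mathrm{IoU}(p,g)}{|\mathit{TP}| + \tfrac{1}{2}|\mathit{FP}| + \tfrac{1}{2}|\mathit{FN}|},
\]

i.e., the average IoU over matched segments, penalized by unmatched predictions and unmatched ground-truth segments.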
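The abstract does not spell out the dense contrastive objective. As a rough illustration only (not the paper's actual formulation or API), the sketch below shows a minimal pixel-level InfoNCE loss in PyTorch, assuming two augmented views of the same image whose feature maps are spatially aligned; the name dense_info_nce and its arguments are hypothetical.

```python
import torch
import torch.nn.functional as F

def dense_info_nce(feat_a, feat_b, temperature=0.1):
    """Illustrative pixel-wise InfoNCE loss between two dense feature maps.

    feat_a, feat_b: (B, C, H, W) feature maps from two augmented views of
    the same image. Spatially corresponding pixels are treated as positive
    pairs; all other pixels in the batch serve as negatives. This is a
    generic sketch of dense contrastive learning, not the paper's method.
    """
    b, c, h, w = feat_a.shape
    # Flatten the spatial grid to (B*H*W, C) and L2-normalize each pixel embedding.
    a = F.normalize(feat_a.permute(0, 2, 3, 1).reshape(-1, c), dim=1)
    p = F.normalize(feat_b.permute(0, 2, 3, 1).reshape(-1, c), dim=1)
    # Cosine similarity of every pixel in view A to every pixel in view B.
    logits = a @ p.t() / temperature  # (N, N) with N = B*H*W
    # The spatially matching pixel in the other view is the positive (diagonal).
    targets = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)
```

In practice such a loss is typically combined with a supervised segmentation loss, which matches the combined supervised and contrastive training procedure mentioned above.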