Semantic Segmentation (SS) is promising for outdoor scene perception in safety-critical applications such as autonomous vehicles and assisted navigation. However, traditional SS relies primarily on RGB images, which limits its reliability in complex outdoor scenes, where RGB images alone lack the information dimensions needed to fully perceive unconstrained environments. As a preliminary investigation, we examine SS in an unexpected obstacle detection scenario, which demonstrates the necessity of multimodal fusion. In this work, we therefore present EAFNet, an Efficient Attention-bridged Fusion Network that exploits complementary information from different optical sensors. Specifically, we incorporate polarization sensing to obtain supplementary information, since its optical characteristics support robust representation of diverse materials. Using a single-shot polarization sensor, we build the first RGB-P dataset, which consists of 394 annotated, pixel-aligned RGB-Polarization images. Comprehensive experiments show the effectiveness of EAFNet in fusing polarization and RGB information, as well as its flexibility to be adapted to other sensor combinations.
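To make the polarization cue concrete, the following is a minimal sketch of how a per-pixel polarization descriptor can be derived from a single-shot sensor. It assumes a sensor that records intensities behind polarizers oriented at 0, 45, 90, and 135 degrees, and it uses the standard linear Stokes relations to compute the degree and angle of linear polarization (DoLP and AoLP). The abstract does not state which polarization representation EAFNet consumes, so the function names and the choice of DoLP/AoLP here are illustrative assumptions, not the authors' pipeline.

```python
import math


def stokes_from_intensities(i0, i45, i90, i135):
    """Linear Stokes parameters from four polarizer-angle intensities.

    Assumption: the inputs are intensities of one pixel measured behind
    polarizers at 0, 45, 90 and 135 degrees (hypothetical sensor layout).
    """
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical component
    s2 = i45 - i135                     # +45 vs. -45 degree component
    return s0, s1, s2


def dolp_aolp(i0, i45, i90, i135, eps=1e-12):
    """Degree (0..1) and angle (radians) of linear polarization for one pixel."""
    s0, s1, s2 = stokes_from_intensities(i0, i45, i90, i135)
    # eps guards against division by zero for completely dark pixels.
    dolp = math.hypot(s1, s2) / max(s0, eps)
    aolp = 0.5 * math.atan2(s2, s1)
    return dolp, aolp


if __name__ == "__main__":
    # Ideal horizontally polarized light of unit intensity.
    print(dolp_aolp(1.0, 0.5, 0.0, 0.5))
    # Ideal unpolarized light of unit intensity.
    print(dolp_aolp(0.5, 0.5, 0.5, 0.5))
```

Under these assumptions, strongly polarizing surfaces yield a DoLP near 1 and diffuse surfaces a DoLP near 0, which illustrates why polarization can supply material information that RGB intensity alone does not carry.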