Light field cameras, by harnessing the power of micro-lens array, are capable of capturing intricate angular and spatial details. This allows for acquiring complex light patterns and details from multiple angles, significantly enhancing the precision of image semantic segmentation, a critical aspect of scene interpretation in vision intelligence. However, the extensive angular information of light field cameras contains a large amount of redundant data, which is overwhelming for the limited hardware resources of intelligent vehicles. Besides, inappropriate compression leads to information corruption and data loss. To excavate representative information, we propose a new paradigm, Omni-Aperture Fusion model (OAFuser), which leverages dense context from the central view and discovers the angular information from sub-aperture images to generate a semantically consistent result. To avoid feature loss during network propagation and simultaneously streamline the redundant information from the light field camera, we present a simple yet very effective Sub-Aperture Fusion Module (SAFM) to embed sub-aperture images into angular features without any additional memory cost. Furthermore, to address the mismatched spatial information across viewpoints, we present a Center Angular Rectification Module (CARM) to realize feature resorting and prevent feature occlusion caused by asymmetric information. Our proposed OAFuser achieves state-of-the-art performance on the UrbanLF-Real and -Syn datasets and sets a new record of 84.93% in mIoU on the UrbanLF-Real Extended dataset, with a gain of +4.53%. The source code of OAFuser will be available at https://github.com/FeiBryantkit/OAFuser.
翻译:暂无翻译