Multi-view image compression plays a critical role in 3D-related applications. Existing methods adopt a predictive coding architecture, which requires joint encoding to compress the corresponding disparity as well as residual information. This demands collaboration among cameras and enforces the epipolar geometric constraint between different views, which makes it challenging to deploy these methods in distributed camera systems with randomly overlapping fields of view. Meanwhile, distributed source coding theory indicates that efficient data compression of correlated sources can be achieved by independent encoding and joint decoding, which motivates us to design a learning-based distributed multi-view image coding (LDMIC) framework. With independent encoders, LDMIC introduces a simple yet effective joint context transfer module based on the cross-attention mechanism at the decoder to effectively capture the global inter-view correlations, which is insensitive to the geometric relationships between images. Experimental results show that LDMIC significantly outperforms both traditional and learning-based MIC methods while enjoying fast encoding speed. Code will be released at https://github.com/Xinjie-Q/LDMIC.
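The core decoder-side idea — letting each view's latent attend over another view's latent so that inter-view correlation is exploited without any disparity or epipolar constraint — can be sketched with plain scaled dot-product cross-attention. This is a minimal NumPy illustration of the mechanism, not the authors' implementation; the function names, shapes, and single-head form are assumptions for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats, d_k):
    """Single-head cross-attention (illustrative sketch).

    query_feats:   (N, d) flattened latent tokens of the view being decoded
    context_feats: (M, d) flattened latent tokens of the other view
    Each query token attends over ALL context tokens, so no geometric
    alignment (disparity/epipolar) between the two views is assumed.
    """
    scores = query_feats @ context_feats.T / np.sqrt(d_k)  # (N, M)
    weights = softmax(scores, axis=-1)                     # rows sum to 1
    return weights @ context_feats                         # (N, d) fused context

# Toy latents for two independently encoded views (hypothetical shapes).
rng = np.random.default_rng(0)
d = 8
view_a = rng.standard_normal((16, d))
view_b = rng.standard_normal((16, d))

ctx_for_a = cross_attention(view_a, view_b, d)
print(ctx_for_a.shape)  # (16, 8)
```

Because the attention weights are computed globally over all spatial positions of the other view, the fused context is invariant to where the overlapping regions happen to fall — which is what makes this kind of module usable for distributed cameras with arbitrary view overlap.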