Multi-view image compression (MIC) plays a critical role in 3D-related applications. Existing methods adopt a predictive coding architecture, which requires joint encoding to compress the corresponding disparity as well as residual information. This demands collaboration among cameras and enforces the epipolar geometric constraint between different views, which makes it challenging to deploy these methods in distributed camera systems with randomly overlapping fields of view. Meanwhile, distributed source coding theory indicates that efficient data compression of correlated sources can be achieved by independent encoding and joint decoding, which motivates us to design a learning-based distributed multi-view image coding (LDMIC) framework. With independent encoders, LDMIC introduces a simple yet effective joint context transfer module based on the cross-attention mechanism at the decoder to effectively capture the global inter-view correlations, which is insensitive to the geometric relationships between images. Experimental results show that LDMIC significantly outperforms both traditional and learning-based MIC methods while enjoying fast encoding speed. Code will be released at https://github.com/Xinjie-Q/LDMIC.
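To illustrate the core idea behind the joint context transfer module, the sketch below shows single-head scaled dot-product cross-attention between the flattened latent features of two views: the tokens of one view act as queries and attend to the tokens of the other view, so inter-view context is gathered globally without any explicit epipolar alignment. This is a minimal illustrative example, not the paper's actual architecture; the function name, feature shapes, and single-head setup are assumptions made for clarity.

```python
import numpy as np

def cross_attention(q_feats, kv_feats):
    """Single-head scaled dot-product cross-attention (illustrative sketch):
    tokens of one view (queries) attend to tokens of the other view
    (keys/values), aggregating global inter-view context."""
    d = q_feats.shape[-1]
    scores = q_feats @ kv_feats.T / np.sqrt(d)       # (Nq, Nk) token similarities
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over view-B tokens
    return weights @ kv_feats                        # context gathered from view B

# Toy usage: a 16x16 latent map with 64 channels per view, flattened to 256 tokens.
rng = np.random.default_rng(0)
view_a = rng.standard_normal((256, 64))  # hypothetical latent of view A
view_b = rng.standard_normal((256, 64))  # hypothetical latent of view B
ctx = cross_attention(view_a, view_b)
print(ctx.shape)  # (256, 64)
```

Because every query token can attend to every token of the other view, this operation captures correlations regardless of where the overlapping regions fall in the two images, which is why such a mechanism is insensitive to the geometric relationship between views.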