Distributed source coding is the task of encoding an input when correlated side information is available only at the decoder. Remarkably, Slepian and Wolf showed in 1973 that an encoder with no access to the correlated side information can asymptotically achieve the same compression rate as when the side information is available at both the encoder and the decoder. While there is significant prior work on this topic in information theory, practical distributed source coding has been limited to synthetic datasets and specific correlation structures. Here we present a general framework for lossy distributed source coding that is agnostic to the correlation structure and can scale to high dimensions. Rather than relying on hand-crafted source modeling, our method uses a powerful conditional deep generative model to learn the distributed encoder and decoder. We evaluate our method on realistic high-dimensional datasets and show substantial improvements in distributed compression performance.
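To make the setup concrete, the sketch below illustrates the asymmetry the abstract describes: an encoder that sees only the input x, and a decoder that additionally receives the correlated side information y. This is a minimal illustrative sketch, not the paper's implementation; the module names, layer sizes, and the simple L2 rate proxy are assumptions made for the example.

```python
import torch
import torch.nn as nn


class DistributedEncoder(nn.Module):
    """Maps x to a low-dimensional code with no access to side information."""

    def __init__(self, x_dim=784, code_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim, 256), nn.ReLU(), nn.Linear(256, code_dim)
        )

    def forward(self, x):
        return self.net(x)


class ConditionalDecoder(nn.Module):
    """Reconstructs x from the code together with decoder-only side information y."""

    def __init__(self, code_dim=16, y_dim=784, x_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(code_dim + y_dim, 256), nn.ReLU(), nn.Linear(256, x_dim)
        )

    def forward(self, code, y):
        return self.net(torch.cat([code, y], dim=-1))


def rate_distortion_loss(x, y, encoder, decoder, rate_weight=1e-3):
    # Distortion: mean squared reconstruction error.
    # Rate proxy: L2 penalty on the code, a stand-in for the learned
    # entropy model a real neural codec would use (an assumption here).
    code = encoder(x)
    x_hat = decoder(code, y)
    distortion = ((x - x_hat) ** 2).mean()
    rate = (code ** 2).mean()
    return distortion + rate_weight * rate
```

The key design point is that the side information y enters only the decoder, so the encoder must learn a code that is useful without knowing y, mirroring the Slepian-Wolf setting referenced above.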