Variable-Length Joint Source-Channel Coding for Semantic Communication

This paper addresses a key challenge for joint source-channel coding (JSCC) in digital semantic communication (SemCom): existing JSCC schemes produce continuous encoded representations, whereas digital systems use discrete variable-length codewords. This incompatibility makes it difficult to achieve physical bit-level rate control with such JSCC approaches for efficient semantic transmission. To address it, we propose a novel end-to-end coding (E2EC) framework. We formulate the semantic coding problem by extending information bottleneck (IB) theory to noisy channels, which yields a tradeoff between bit-level communication rate and semantic distortion. By structurally decomposing the encoding into two parts that handle code length and code content, respectively, we construct an end-to-end trainable encoder that compresses a data source directly into a finite codebook. To optimize E2EC across non-differentiable operations such as sampling, we use policy gradient to enable gradient-based updates. Experimental results show that E2EC achieves high inference quality at low bit rates, outperforming representative baselines compatible with digital SemCom systems.
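To make the length/content decomposition and the policy-gradient step concrete, the following is a minimal toy sketch, not the paper's E2EC model. It assumes a categorical head over code lengths, independent Bernoulli heads for the bits, a binary symmetric channel with flip probability `EPS`, and a hand-made cost of the form distortion plus `beta` times length. The names `L_MAX`, `TARGET`, `EPS`, `cost`, and every function below are hypothetical and chosen only for illustration; the abstract does not specify the actual architecture, decoder, or objective.

```python
import itertools
import numpy as np

L_MAX = 3            # hypothetical maximum codeword length in bits
TARGET = (1, 0, 1)   # hypothetical "semantic" bits the receiver wants
EPS = 0.1            # assumed flip probability of a binary symmetric channel

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(length, bits, beta):
    """Toy rate-distortion cost: expected bit errors plus beta * length."""
    # A sent bit equal to the target is wrong only if the channel flips it.
    sent_err = sum(EPS if b == t else 1.0 - EPS for b, t in zip(bits, TARGET))
    # An unsent bit is guessed by the receiver, so it is wrong half the time.
    unsent_err = 0.5 * (L_MAX - length)
    return sent_err + unsent_err + beta * length

def log_prob_and_score(len_logits, bit_logits, length, bits):
    """Return log p(length, bits) and its gradient w.r.t. both logit vectors."""
    p_len, p_bit = softmax(len_logits), sigmoid(bit_logits)
    b = np.asarray(bits, dtype=float)
    logp = np.log(p_len[length - 1]) + np.sum(
        b * np.log(p_bit[:length]) + (1.0 - b) * np.log(1.0 - p_bit[:length]))
    g_len = -p_len                      # d log softmax_k / d logits = e_k - p
    g_len[length - 1] += 1.0
    g_bit = np.zeros_like(bit_logits)   # unsent positions get zero gradient
    g_bit[:length] = b - p_bit[:length]
    return logp, g_len, g_bit

def sample_codeword(len_logits, bit_logits, rng):
    """Sample the length first, then that many bits (both non-differentiable)."""
    length = int(rng.choice(L_MAX, p=softmax(len_logits))) + 1
    draws = rng.random(length) < sigmoid(bit_logits[:length])
    return length, tuple(int(v) for v in draws)

def reinforce_grad(len_logits, bit_logits, rng, n_samples=256, beta=0.1):
    """Score-function (REINFORCE) gradient estimate with a batch-mean baseline."""
    samples = [sample_codeword(len_logits, bit_logits, rng) for _ in range(n_samples)]
    costs = np.array([cost(l, b, beta) for l, b in samples])
    adv = costs - costs.mean()
    g_len, g_bit = np.zeros(L_MAX), np.zeros(L_MAX)
    for (l, b), a in zip(samples, adv):
        _, s_len, s_bit = log_prob_and_score(len_logits, bit_logits, l, b)
        g_len += a * s_len
        g_bit += a * s_bit
    # Dividing by n - 1 makes this an unbiased sample covariance of cost and score.
    return g_len / (n_samples - 1), g_bit / (n_samples - 1)

def all_codewords():
    """Enumerate the finite codebook: every bit string of length 1..L_MAX."""
    for length in range(1, L_MAX + 1):
        for bits in itertools.product((0, 1), repeat=length):
            yield length, bits

def exact_expected_cost(len_logits, bit_logits, beta=0.1):
    return sum(np.exp(log_prob_and_score(len_logits, bit_logits, l, b)[0]) * cost(l, b, beta)
               for l, b in all_codewords())

def exact_grad(len_logits, bit_logits, beta=0.1):
    """Exact gradient of the expected cost, computed by enumerating the codebook."""
    g_len, g_bit = np.zeros(L_MAX), np.zeros(L_MAX)
    for l, b in all_codewords():
        logp, s_len, s_bit = log_prob_and_score(len_logits, bit_logits, l, b)
        w = np.exp(logp) * cost(l, b, beta)
        g_len += w * s_len
        g_bit += w * s_bit
    return g_len, g_bit
```

In this toy setting the codebook is small enough to enumerate, so `exact_grad` can serve as a reference for the sampled estimate from `reinforce_grad`. Raising `beta` above roughly `0.5 - EPS` makes each extra bit cost more than it saves in distortion, so gradient descent on the length logits would be expected to favor shorter codewords; this illustrates the rate-distortion tradeoff only in spirit and says nothing about the paper's reported results.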
