Spectral sub-bands do not carry the same perceptual relevance. In audio coding, it is therefore desirable to control each constituent band independently so that bitrate assignment and signal reconstruction can be performed efficiently. In this work, we present a novel neural audio coding network that natively supports a multi-band coding paradigm. Our model extends the idea of compressed skip connections in the U-Net-based codec, allowing independent control over band-specific reconstruction and bit allocation for both the core band and the high band. Our system reconstructs the full-band signal mainly from the condensed core-band code, thereby exploiting and showcasing its bandwidth extension capabilities to the fullest. Meanwhile, the low-bitrate high-band code assists the high-band reconstruction, similarly to spectral band replication in MPEG audio codecs. MUSHRA tests show that the proposed model not only improves the quality of the core band by explicitly assigning more bits to it but also retains good quality in the high band.