We present SoundStream, a novel neural audio codec that can efficiently compress speech, music and general audio at bitrates normally targeted by speech-tailored codecs. SoundStream relies on a model architecture composed of a fully convolutional encoder/decoder network and a residual vector quantizer, which are trained jointly end-to-end. Training leverages recent advances in text-to-speech and speech enhancement, which combine adversarial and reconstruction losses to allow the generation of high-quality audio content from quantized embeddings. By training with structured dropout applied to quantizer layers, a single model can operate across variable bitrates from 3 kbps to 18 kbps, with a negligible quality loss when compared with models trained at fixed bitrates. In addition, the model is amenable to a low-latency implementation, which supports streamable inference and runs in real time on a smartphone CPU. In subjective evaluations using audio at a 24 kHz sampling rate, SoundStream at 3 kbps outperforms Opus at 12 kbps and approaches EVS at 9.6 kbps. Moreover, we are able to perform joint compression and enhancement either at the encoder or at the decoder side with no additional latency, which we demonstrate through background noise suppression for speech.
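To make the residual vector quantization and the structured quantizer dropout mentioned above concrete, the following is a minimal PyTorch sketch, not the authors' implementation: each stage quantizes the residual left by the previous stage, and limiting the number of active stages at training time lets one model serve several bitrates. All names, shapes, and hyperparameters here (e.g., `num_quantizers`, `codebook_size`) are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ResidualVectorQuantizer(nn.Module):
    """Illustrative sketch of residual vector quantization (RVQ).

    Each codebook quantizes the residual error of the previous stage,
    so summing the selected codewords refines the approximation of the
    encoder embedding stage by stage.
    """

    def __init__(self, num_quantizers: int = 8, codebook_size: int = 1024, dim: int = 128):
        super().__init__()
        self.codebooks = nn.ModuleList(
            nn.Embedding(codebook_size, dim) for _ in range(num_quantizers)
        )

    def forward(self, z: torch.Tensor, num_active: int | None = None) -> torch.Tensor:
        # z: (batch, dim) encoder embeddings for one frame.
        if num_active is None:
            num_active = len(self.codebooks)
        residual = z
        quantized = torch.zeros_like(z)
        for codebook in self.codebooks[:num_active]:
            # Nearest-neighbour lookup against the current residual.
            distances = torch.cdist(residual, codebook.weight)
            indices = distances.argmin(dim=-1)
            selected = codebook(indices)
            quantized = quantized + selected
            residual = residual - selected
        return quantized


# "Structured dropout" over quantizer layers (assumption about the exact
# sampling scheme): draw the number of active stages per training batch so
# the same model is exposed to the full range of bitrates.
rvq = ResidualVectorQuantizer()
z = torch.randn(4, 128)
num_active = torch.randint(1, len(rvq.codebooks) + 1, (1,)).item()
z_quantized = rvq(z, num_active=num_active)
```

At inference time, using fewer codebooks simply lowers the bitrate (fewer indices to transmit per frame) at the cost of a coarser reconstruction, which is how a single trained model covers the 3 kbps to 18 kbps range described above.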