TMGAN-PLC: Audio Packet Loss Concealment using Temporal Memory Generative Adversarial Network

Real-time communication over packet-switched networks is now part of everyday life, but it inevitably suffers from network delay and data loss under tight real-time constraints. Audio packet loss concealment (PLC) algorithms mitigate these voice transmission failures by reconstructing the lost information. Because transmission latency and device memory are limited, high-quality voice reconstruction from a relatively small packet buffer remains difficult for PLC. In this paper, we propose a temporal memory generative adversarial network for audio PLC, dubbed TMGAN-PLC, which comprises a novel nested-UNet generator and time-domain and frequency-domain discriminators. Specifically, the generator combines the nested-UNet with temporal feature-wise linear modulation to finely adjust intra-frame information and establish inter-frame temporal dependencies. To recover the speech content missing after longer loss bursts, we employ multi-stage gated vector quantizers to capture the correct content and reconstruct smooth, near-real audio. Extensive experiments on the PLC Challenge dataset demonstrate that the proposed method yields promising performance in terms of speech quality, intelligibility, and PLCMOS.
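
To make the idea of temporal feature-wise linear modulation more concrete, here is a minimal sketch of the general technique, not the authors' implementation. Feature-wise linear modulation scales and shifts each feature channel (`gamma * x + beta`); in the temporal variant, the scale and shift for a frame are derived from earlier frames. Everything below is an assumption made for illustration: the names `film` and `temporal_film`, the `decay` parameter, and above all the running average, which stands in for the learned network that would normally predict the modulation parameters.

```python
from typing import List

Frame = List[float]  # one feature vector per audio frame


def film(frame: Frame, gamma: Frame, beta: Frame) -> Frame:
    """Feature-wise linear modulation: scale and shift each channel."""
    return [g * x + b for x, g, b in zip(frame, gamma, beta)]


def temporal_film(frames: List[Frame], decay: float = 0.5) -> List[Frame]:
    """Modulate each frame with parameters derived from earlier frames only.

    ASSUMPTION: a running average stands in for the learned network that
    would normally predict gamma and beta; the mappings are hypothetical.
    """
    channels = len(frames[0])
    state = [0.0] * channels  # causal summary of the frames seen so far
    out = []
    for frame in frames:
        gamma = [1.0 + s for s in state]  # hypothetical state -> scale
        beta = [0.1 * s for s in state]   # hypothetical state -> shift
        out.append(film(frame, gamma, beta))
        # Update the memory after use, so frame t never modulates itself.
        state = [decay * s + (1.0 - decay) * x for s, x in zip(state, frame)]
    return out
```

With an empty memory the first frame passes through unchanged, while later frames are scaled and shifted according to what preceded them. That is the sense in which such a layer can carry inter-frame dependencies while still operating on one frame at a time.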

