将 " 零热多 " 改为 " 多声转换 ",自用VAE (Enhancing Zero-Shot Many to Many Voice Conversion with Self-Attention VAE) - 专知论文

会员服务 ·

0

变分自编码 · 正则化项 · 层 · MoDELS · Networking ·

2022 年 8 月 22 日

Enhancing Zero-Shot Many to Many Voice Conversion with Self-Attention VAE

翻译：将 " 零热多 " 改为 " 多声转换 ",自用VAE

Ziang Long,Yunling Zheng,Meng Yu,Jack Xin

Variational auto-encoder (VAE) is an effective neural network architecture to disentangle a speech utterance into speaker identity and linguistic content latent embeddings, then generate an utterance for a target speaker from that of a source speaker. This is possible by concatenating the identity embedding of the target speaker and the content embedding of the source speaker uttering a desired sentence. In this work, we propose to improve VAE models with self-attention and structural regularization (RGSM). Specifically, we found a suitable location of VAE's decoder to add a self-attention layer for incorporating non-local information in generating a converted utterance and hiding the source speaker's identity. We applied relaxed group-wise splitting method (RGSM) to regularize network weights and remarkably enhance generalization performance. In experiments of zero-shot many-to-many voice conversion task on VCTK data set, with the self-attention layer and relaxed group-wise splitting method, our model achieves a gain of speaker classification accuracy on unseen speakers by 28.3\% while slightly improved conversion voice quality in terms of MOSNet scores. Our encouraging findings point to future research on integrating more variety of attention structures in VAE framework while controlling model size and overfitting for advancing zero-shot many-to-many voice conversions.

翻译：自动读取器(VAE)是一个有效的神经网络架构,将语音表达器分解成语音身份和语言内容潜伏的嵌入器,然后从源演讲者的角度为目标演讲者提供话语,这可以通过将目标演讲者的身份嵌入和源演讲者表达想要的句子的内容嵌入等同起来来实现。在这项工作中,我们提议用自我注意和结构规范(RGSM)来改进 VAE 模式。具体地说,我们找到了VAE 解调器的适当位置,以添加一个自我注意层,将非本地信息纳入生成转换的语音表达器和隐藏源演讲者的身份。我们采用了宽松的分组分裂法(RGSM)来规范网络重量和源演讲者表达器内容嵌入内容。在试验VCTK 数据集的零点多到中语音转换任务时,用自我注意层和群体分裂法宽松的方法,我们的模式在将远方发言人的分类精确度上增加了28.3-10分,同时将语音质量略地转换成未来SNet框架,同时将许多项研究结果的零位转换成升级。

0

相关内容

变分自编码

变分自编码

NeurlPS 2022 | 自然语言处理相关论文分类整理

NeurlPS 2022 | 自然语言处理相关论文分类整理

专知会员服务

51+阅读 · 2022年10月2日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

水体重金属在螯合秸秆纤维超微结构中的吸附机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

银/氧化铝催化乙醇选择性还原氮氧化物微观结构与构效关系研究

国家自然科学基金

0+阅读 · 2012年12月31日

酰胺键骨架构建的手性催化剂及其催化的不对称串联环化反应研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于协作机制的无线多媒体网络效用提升方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

茧蜂病毒调控寄主粒细胞hemichannel与凋亡相互关系的研究

国家自然科学基金

0+阅读 · 2012年12月31日

常温条件下甲烷直接制备甲醇的电化学反应系统和机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

公共场所病原微生物气溶胶人群感染风险定量评价方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于语言理解的机器翻译方法研究

国家自然科学基金

2+阅读 · 2009年12月31日

气溶胶对夜间大气化学过程和NOx清除率影响研究

国家自然科学基金

0+阅读 · 2009年12月31日

A Systematic Survey on Deep Generative Models for Graph Generation

Arxiv

18+阅读 · 2022年10月4日

Relative representations enable zero-shot latent space communication

Arxiv

0+阅读 · 2022年9月30日

Lossy Image Compression with Conditional Diffusion Models

Arxiv

0+阅读 · 2022年9月30日

A Survey of Adversarial Learning on Graphs

Arxiv

38+阅读 · 2020年3月10日

Differentiable Reasoning on Large Knowledge Bases and Natural Language

Arxiv

12+阅读 · 2019年12月17日

Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources

Arxiv

13+阅读 · 2019年11月14日

Self-Attention with Relative Position Representations

Arxiv

27+阅读 · 2018年4月12日

End-to-End Multi-Task Learning with Attention

Arxiv

19+阅读 · 2018年3月28日

Distance-based Self-Attention Network for Natural Language Inference

Arxiv

10+阅读 · 2017年12月6日

Deep Semantic Role Labeling with Self-Attention

Arxiv

13+阅读 · 2017年12月5日

VIP会员

文章信息

相关主题

变分自编码

相关VIP内容

NeurlPS 2022 | 自然语言处理相关论文分类整理

NeurlPS 2022 | 自然语言处理相关论文分类整理

专知会员服务

51+阅读 · 2022年10月2日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《物联网（IoT）中的无人机通信高效控制》135页

《在GNSS信号降级环境中利用共识实现无人机集群稳健协调》

中程单向攻击无人机的战略意义：俄乌战争启示

《面向无人机集群的避障动态传感器覆盖算法》最新38页

相关资讯

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

A Systematic Survey on Deep Generative Models for Graph Generation

Arxiv

18+阅读 · 2022年10月4日

Relative representations enable zero-shot latent space communication

Arxiv

0+阅读 · 2022年9月30日

Lossy Image Compression with Conditional Diffusion Models

Arxiv

0+阅读 · 2022年9月30日

A Survey of Adversarial Learning on Graphs

Arxiv

38+阅读 · 2020年3月10日

Differentiable Reasoning on Large Knowledge Bases and Natural Language

Arxiv

12+阅读 · 2019年12月17日

Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources

Arxiv

13+阅读 · 2019年11月14日

Self-Attention with Relative Position Representations

Arxiv

27+阅读 · 2018年4月12日

End-to-End Multi-Task Learning with Attention

Arxiv

19+阅读 · 2018年3月28日

Distance-based Self-Attention Network for Natural Language Inference

Arxiv

10+阅读 · 2017年12月6日

Deep Semantic Role Labeling with Self-Attention

Arxiv

13+阅读 · 2017年12月5日

相关基金

水体重金属在螯合秸秆纤维超微结构中的吸附机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

银/氧化铝催化乙醇选择性还原氮氧化物微观结构与构效关系研究

国家自然科学基金

0+阅读 · 2012年12月31日

酰胺键骨架构建的手性催化剂及其催化的不对称串联环化反应研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于协作机制的无线多媒体网络效用提升方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

茧蜂病毒调控寄主粒细胞hemichannel与凋亡相互关系的研究

国家自然科学基金

0+阅读 · 2012年12月31日

常温条件下甲烷直接制备甲醇的电化学反应系统和机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

公共场所病原微生物气溶胶人群感染风险定量评价方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于语言理解的机器翻译方法研究

国家自然科学基金

2+阅读 · 2009年12月31日

气溶胶对夜间大气化学过程和NOx清除率影响研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员