In this paper, we develop a new multi-singer Chinese neural singing voice synthesis (SVS) system named WeSinger. To improve the accuracy and naturalness of the synthesized singing voice, we design several dedicated modules and techniques: 1) a deep bidirectional LSTM-based duration model with a multi-scale rhythm loss and a post-processing step; 2) a Transformer-like acoustic model with a progressive pitch-weighted decoder loss; 3) a 24 kHz pitch-aware LPCNet neural vocoder to produce high-quality singing waveforms; and 4) a novel data augmentation method with multi-singer pre-training for stronger robustness and naturalness. To our knowledge, WeSinger is the first SVS system to adopt a 24 kHz LPCNet and multi-singer pre-training simultaneously. Both quantitative and qualitative evaluation results demonstrate the effectiveness of WeSinger in terms of accuracy and naturalness, and WeSinger achieves state-of-the-art performance on the recent public Chinese singing corpus Opencpop\footnote{https://wenet.org.cn/opencpop/}. Some synthesized singing samples are available online\footnote{https://zzw922cn.github.io/wesinger/}.
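The abstract mentions a pitch-weighted decoder loss without giving its exact form. As a rough illustration of the general idea (not the paper's actual formulation), the sketch below assumes the weighting multiplies each frame's mel-spectrogram reconstruction error by a factor derived from that frame's normalized F0, so pitch-salient voiced frames contribute more to training; the function name and normalization scheme are hypothetical.

```python
import numpy as np

def pitch_weighted_l1(pred_mel, target_mel, f0, eps=1e-8):
    """Per-frame L1 mel loss weighted by normalized pitch (illustrative sketch).

    pred_mel, target_mel: (T, n_mels) arrays of mel-spectrogram frames.
    f0: (T,) pitch track in Hz, with 0 for unvoiced frames.
    Frames with higher F0 receive weights up to 2x, emphasizing
    pitch-salient regions of the reconstruction error.
    """
    frame_l1 = np.abs(pred_mel - target_mel).mean(axis=1)  # (T,) per-frame error
    weights = 1.0 + f0 / (f0.max() + eps)                  # in [1, 2]
    return float((weights * frame_l1).mean())
```

In practice such a weight could also be annealed over training ("progressive"), e.g. by interpolating from uniform weights toward the pitch-derived ones as training proceeds.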