In this paper, we propose a model for style transfer from speech to singing voice. In contrast to previous signal processing-based methods, which require high-quality singing templates or phoneme synchronization, we explore a data-driven approach to the problem of converting natural speech to singing voice. We develop a novel neural network architecture, called SymNet, which models the alignment of the input speech with the target melody while preserving the speaker identity and naturalness. The proposed SymNet model comprises a symmetric stack of three types of layers: convolutional, transformer, and self-attention layers. The paper also explores novel data augmentation and generative loss annealing methods to facilitate model training. Experiments are performed on the NUS and NHSS datasets, which consist of parallel recordings of speech and singing voice. In these experiments, we show that the proposed SymNet model significantly improves objective reconstruction quality over previously published methods and baseline architectures. Further, a subjective listening test confirms the improved quality of the audio obtained using the proposed approach (an absolute improvement of 0.37 in mean opinion score over the baseline system).
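To make the architecture description concrete, below is a minimal sketch of a symmetric convolutional-transformer-self-attention stack in PyTorch. The layer counts, dimensions, mirroring scheme, and residual connection are illustrative assumptions for a sequence-to-sequence spectral mapping, not the published SymNet configuration.

```python
# Hypothetical sketch of a symmetric layer stack in the spirit of SymNet.
# All hyperparameters (feat_dim, hidden_dim, n_heads, layer counts) are
# illustrative assumptions, not values from the paper.
import torch
import torch.nn as nn

class SymNetSketch(nn.Module):
    def __init__(self, feat_dim=80, hidden_dim=256, n_heads=4):
        super().__init__()
        # Front half: convolution captures local spectral patterns.
        self.conv_in = nn.Sequential(
            nn.Conv1d(feat_dim, hidden_dim, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        # Middle: transformer layers model long-range dependencies,
        # e.g. aligning input speech frames with the target melody.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=2)
        # Standalone self-attention layer at the center of the stack.
        self.self_attn = nn.MultiheadAttention(
            hidden_dim, n_heads, batch_first=True)
        # Back half mirrors the front: convolution maps back to feature space.
        self.conv_out = nn.Conv1d(hidden_dim, feat_dim, kernel_size=5, padding=2)

    def forward(self, x):
        # x: (batch, time, feat_dim) spectral features of the input speech.
        h = self.conv_in(x.transpose(1, 2)).transpose(1, 2)
        h = self.transformer(h)
        attn_out, _ = self.self_attn(h, h, h)
        h = h + attn_out  # residual connection around self-attention
        # Output has the same shape as the input, interpreted as
        # spectral features of the converted singing voice.
        return self.conv_out(h.transpose(1, 2)).transpose(1, 2)

model = SymNetSketch()
speech = torch.randn(2, 200, 80)  # batch of 2 utterances, 200 frames each
singing = model(speech)           # same shape as the input features
```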