This paper proposes a controllable singing voice synthesis system capable of generating an expressive singing voice via two novel methodologies. First, we propose a local style token module, which predicts frame-level style tokens from input pitch and text sequences, allowing the system to control musical expressions often unspecified in sheet music (e.g., breathing and intensity). Second, we propose a dual-path pitch encoder that accepts either of two pitch inputs: a MIDI pitch sequence or an f0 contour. Because the initial generation of a singing voice is usually performed from a MIDI pitch sequence, one can later extract the f0 contour from the generated singing voice and modify it at a finer level as desired. Through quantitative and qualitative evaluations, we confirm that the proposed model can control various musical expressions without sacrificing the sound quality of the synthesized singing voice.