MaskedSpeech: Context-aware Speech Synthesis with Masking Strategy (专知论文)


November 11, 2022

MaskedSpeech: Context-aware Speech Synthesis with Masking Strategy


Ya-Jie Zhang, Wei Song, Yanghao Yue, Zhengchen Zhang, Youzheng Wu, Xiaodong He

Humans often speak continuously, which leads to coherent and consistent prosody across neighboring utterances. However, most state-of-the-art speech synthesis systems consider only the information within each sentence and ignore contextual semantic and acoustic features. This makes them inadequate for generating high-quality paragraph-level speech, which requires high expressiveness and naturalness. To synthesize natural and expressive speech for a paragraph, this paper proposes MaskedSpeech, a context-aware speech synthesis system that considers both contextual semantic and acoustic features. Inspired by the masking strategy used in speech editing research, the acoustic features of the current sentence are masked out, concatenated with those of the contextual speech, and used as additional model input. The phoneme encoder takes the concatenated phoneme sequence of neighboring sentences as input and learns fine-grained semantic information from the contextual text. Furthermore, cross-utterance coarse-grained semantic features are employed to improve prosody generation. The model is trained to reconstruct the masked acoustic features, aided by both the contextual semantic and acoustic features. Experimental results demonstrate that the proposed MaskedSpeech significantly outperforms the baseline system in terms of naturalness and expressiveness.
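The masking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of one preceding and one following sentence as context, the zero mask value, and the L1 loss restricted to masked frames are all assumptions made for illustration. Acoustic features are represented as plain lists of mel-spectrogram frames to keep the example dependency-free.

```python
from typing import List, Tuple

Frame = List[float]  # one acoustic frame, e.g. a mel-spectrogram column (assumed layout)


def build_masked_input(
    prev_mel: List[Frame],
    cur_mel: List[Frame],
    next_mel: List[Frame],
    mask_value: float = 0.0,  # assumption: masked frames are filled with a constant
) -> Tuple[List[Frame], List[bool]]:
    """Mask the current sentence's frames and concatenate them with the context.

    Returns the concatenated model input and a per-frame flag that is True
    where the frame was masked and has to be reconstructed.
    """
    n_mels = len(cur_mel[0]) if cur_mel else 0
    masked_cur = [[mask_value] * n_mels for _ in cur_mel]
    frames = [list(f) for f in prev_mel] + masked_cur + [list(f) for f in next_mel]
    is_masked = (
        [False] * len(prev_mel) + [True] * len(cur_mel) + [False] * len(next_mel)
    )
    return frames, is_masked


def concat_phonemes(prev: List[str], cur: List[str], nxt: List[str]) -> List[str]:
    """Concatenate phoneme sequences of neighboring sentences for the phoneme encoder."""
    return prev + cur + nxt


def masked_l1_loss(
    pred: List[Frame], target: List[Frame], is_masked: List[bool]
) -> float:
    """Mean absolute error over masked frames only (assumed training objective)."""
    total, count = 0.0, 0
    for p, t, m in zip(pred, target, is_masked):
        if m:
            total += sum(abs(a - b) for a, b in zip(p, t))
            count += len(t)
    return total / count if count else 0.0


# Toy example: one context frame on each side, two frames for the current sentence.
prev_mel = [[1.0, 1.0]]
cur_mel = [[2.0, 2.0], [3.0, 3.0]]
next_mel = [[4.0, 4.0]]

frames, is_masked = build_masked_input(prev_mel, cur_mel, next_mel)
target = prev_mel + cur_mel + next_mel
# Using the masked input itself as a "prediction" shows the loss before any learning.
loss = masked_l1_loss(frames, target, is_masked)
```

In this toy case the context frames pass through unchanged, the two current-sentence frames are zeroed, and the loss is computed only over those two frames, so a model would be rewarded solely for reconstructing the masked region from its surroundings.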

