Regotron: Regularizing the Tacotron2 architecture via monotonic alignment loss

April 28, 2022

Efthymios Georgiou, Kosmas Kritsis, Georgios Paraskevopoulos, Athanasios Katsamanis, Vassilis Katsouros, Alexandros Potamianos

From arXiv; submitted to Interspeech 2022

Recent deep learning Text-to-Speech (TTS) systems have achieved impressive performance by generating speech close to human parity. However, they suffer from training stability issues as well as incorrect alignment of the intermediate acoustic representation with the input text sequence. In this work, we introduce Regotron, a regularized version of Tacotron2 which aims to alleviate the training issues and at the same time produce monotonic alignments. Our method augments the vanilla Tacotron2 objective function with an additional term, which penalizes non-monotonic alignments in the location-sensitive attention mechanism. By properly adjusting this regularization term, we show that the loss curves become smoother, and at the same time Regotron consistently produces monotonic alignments in unseen examples even at an early stage (13% of the total number of epochs) of its training process, whereas the fully converged Tacotron2 fails to do so. Moreover, our proposed regularization method has no additional computational overhead, while reducing common TTS mistakes and achieving slightly improved speech naturalness according to subjective mean opinion scores (MOS) collected from 50 evaluators.
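The abstract does not give the exact form of the regularization term, so the following is only a minimal sketch of how a monotonicity penalty on an attention matrix could be added to a training loss. It assumes a hinge penalty on backward moves of the attention centroid (the expected input position at each decoder step). The function names, the `delta` margin, and the `weight` factor are hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def attention_centroids(alignment: Sequence[Sequence[float]]) -> List[float]:
    """Expected input-token position for each decoder step.

    alignment[t][n] is the attention weight that decoder step t places on
    input token n; each row is assumed to sum to 1.
    """
    return [sum(n * w for n, w in enumerate(row)) for row in alignment]


def monotonic_penalty(alignment: Sequence[Sequence[float]],
                      delta: float = 0.0) -> float:
    """Hinge penalty on backward moves of the attention centroid.

    Each pair of consecutive steps contributes max(0, c[t] - c[t+1] + delta),
    so a pair costs nothing when the centroid advances by at least delta.
    (Assumed form, for illustration only.)
    """
    c = attention_centroids(alignment)
    if len(c) < 2:
        return 0.0
    hinge = [max(0.0, c[t] - c[t + 1] + delta) for t in range(len(c) - 1)]
    return sum(hinge) / len(hinge)


def regularized_loss(base_loss: float,
                     alignment: Sequence[Sequence[float]],
                     weight: float = 1.0,
                     delta: float = 0.0) -> float:
    """Base objective plus the weighted monotonicity penalty."""
    return base_loss + weight * monotonic_penalty(alignment, delta)


# Toy alignments over 3 decoder steps and 3 input tokens.
monotonic = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
backtracking = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

print(monotonic_penalty(monotonic))     # forward-only attention
print(monotonic_penalty(backtracking))  # attention jumps back once
```

In this sketch the penalty is computed from attention weights that the model already produces during the forward pass. In a real training setup the same computation would be written with differentiable tensor operations so that gradients reach the attention mechanism.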
