以日本文本到语音合成中双向编码器表示的文字断裂预测 (Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis) - 专知论文

会员服务 ·

0

双向LSTM · 语音合成 · 特征提取 · 得分 · 语言模型化 ·

2021 年 4 月 26 日

Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis

翻译：以日本文本到语音合成中双向编码器表示的文字断裂预测

Kosuke Futamata,Byeongseon Park,Ryuichi Yamamoto,Kentaro Tachibana

from arxiv, Submitted to INTERSPEECH 2021

We propose a novel phrase break prediction method that combines implicit features extracted from a pre-trained large language model, a.k.a BERT, and explicit features extracted from BiLSTM with linguistic features. In conventional BiLSTM based methods, word representations and/or sentence representations are used as independent components. The proposed method takes account of both representations to extract the latent semantics, which cannot be captured by previous methods. The objective evaluation results show that the proposed method obtains an absolute improvement of 3.2 points for the F1 score compared with BiLSTM-based conventional methods using linguistic features. Moreover, the perceptual listening test results verify that a TTS system that applied our proposed method achieved a mean opinion score of 4.39 in prosody naturalness, which is highly competitive with the score of 4.37 for synthesized speech with ground-truth phrase breaks.

翻译：我们建议一种新型的短语断裂预测方法,结合从预先训练过的大型语言模型(a.k.a.BERT)中提取的隐含特征和从BILSTM中提取的具有语言特征的清晰特征。在传统的BILSTM方法中,单词表达和/或句表述是作为独立的组成部分使用。拟议方法考虑到两种表达方式,以提取以前方法无法捕捉的潜在语义。客观评价结果显示,拟议方法F1分的绝对改善3.2分,而使用语言特征的BILSTM常规方法则改进了3.2分。此外,概念性倾听测试结果证实,采用我们拟议方法的TTS系统在行走自然性方面达到了4.39分的平均评分,这与4.37分的合成语义和地面真实语断裂非常有竞争力。

0

相关内容

双向LSTM

BiLSTM是Bi-directional Long Short-Term Memory的缩写，是由前向LSTM与后向LSTM组合而成。在自然语言处理任务中都常被用来建模上下文信息。

【Facebook AI】无监督机器翻译，336页ppt，Unsupervised Machine Translation

专知会员服务

19+阅读 · 2020年11月17日

手写实现李航《统计学习方法》书中全部算法

手写实现李航《统计学习方法》书中全部算法

专知会员服务

49+阅读 · 2020年8月2日

【Google】无监督机器翻译，Unsupervised Machine Translation

【Google】无监督机器翻译，Unsupervised Machine Translation

专知会员服务

36+阅读 · 2020年3月3日

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

专知会员服务

33+阅读 · 2020年1月5日

【清华大学】Bert 简介，Bidirectional Encoder Representations from Transformers，21页ppt

【清华大学】Bert 简介，Bidirectional Encoder Representations from Transformers，21页ppt

专知会员服务

79+阅读 · 2019年12月29日

【华盛顿大学】知识建模+生成式推理，60页ppt，Cracking Commonsense Intelligence with Knowledge Modeling + Generative Reasoning

【华盛顿大学】知识建模+生成式推理，60页ppt，Cracking Commonsense Intelligence with Knowledge Modeling + Generative Reasoning

专知会员服务

54+阅读 · 2019年12月27日

【NLP| 推荐文章】语言语音处理（Speech and Language Processing(3rd ed.draft)）

专知会员服务

15+阅读 · 2019年11月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

ExBert — 可视化分析Transformer学到的表示

ExBert — 可视化分析Transformer学到的表示

专知会员服务

32+阅读 · 2019年10月16日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

【资源】语音增强资源集锦

【资源】语音增强资源集锦

专知

8+阅读 · 2020年7月4日

LibRec 精选：AutoML for Contextual Bandits

LibRec 精选：AutoML for Contextual Bandits

LibRec智能推荐

7+阅读 · 2019年9月19日

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Facebook开源增强版LASER库，包含93种语言工具包

Facebook开源增强版LASER库，包含93种语言工具包

机器之心

5+阅读 · 2019年1月23日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Facebook PyText 在 Github 上开源了

Facebook PyText 在 Github 上开源了

AINLP

7+阅读 · 2018年12月14日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新四篇CVPR2018 视频描述生成相关论文—双向注意力、Transformer、重构网络、层次强化学习

【论文推荐】最新四篇CVPR2018 视频描述生成相关论文—双向注意力、Transformer、重构网络、层次强化学习

专知

31+阅读 · 2018年6月4日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Question Answering Infused Pre-training of General-Purpose Contextualized Representations

Arxiv

0+阅读 · 2021年6月15日

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

Arxiv

2+阅读 · 2021年6月14日

Exploiting Sentence-Level Representations for Passage Ranking

Arxiv

0+阅读 · 2021年6月14日

S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations

Arxiv

0+阅读 · 2021年6月14日

Implicit Representations of Meaning in Neural Language Models

Arxiv

0+阅读 · 2021年6月1日

Text Summarization with Pretrained Encoders

Arxiv

5+阅读 · 2019年8月22日

Phrase-Based & Neural Unsupervised Machine Translation

Phrase-Based & Neural Unsupervised Machine Translation

Arxiv

9+阅读 · 2018年8月13日

Recursive Neural Network Based Preordering for English-to-Japanese Machine Translation

Arxiv

7+阅读 · 2018年5月25日

Deep contextualized word representations

Arxiv

10+阅读 · 2018年3月22日

What Level of Quality can Neural Machine Translation Attain on Literary Text?

Arxiv

5+阅读 · 2018年1月15日

VIP会员

文章信息

相关主题

语言模型化

相关VIP内容

【Facebook AI】无监督机器翻译，336页ppt，Unsupervised Machine Translation

专知会员服务

19+阅读 · 2020年11月17日

手写实现李航《统计学习方法》书中全部算法

手写实现李航《统计学习方法》书中全部算法

专知会员服务

49+阅读 · 2020年8月2日

【Google】无监督机器翻译，Unsupervised Machine Translation

【Google】无监督机器翻译，Unsupervised Machine Translation

专知会员服务

36+阅读 · 2020年3月3日

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

专知会员服务

33+阅读 · 2020年1月5日

【清华大学】Bert 简介，Bidirectional Encoder Representations from Transformers，21页ppt

【清华大学】Bert 简介，Bidirectional Encoder Representations from Transformers，21页ppt

专知会员服务

79+阅读 · 2019年12月29日

【华盛顿大学】知识建模+生成式推理，60页ppt，Cracking Commonsense Intelligence with Knowledge Modeling + Generative Reasoning

【华盛顿大学】知识建模+生成式推理，60页ppt，Cracking Commonsense Intelligence with Knowledge Modeling + Generative Reasoning

专知会员服务

54+阅读 · 2019年12月27日

【NLP| 推荐文章】语言语音处理（Speech and Language Processing(3rd ed.draft)）

专知会员服务

15+阅读 · 2019年11月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

ExBert — 可视化分析Transformer学到的表示

ExBert — 可视化分析Transformer学到的表示

专知会员服务

32+阅读 · 2019年10月16日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

【NTU博士论文】深度神经网络的参数高效推理与训练

人工智能：实时战斗适应

【NeurIPS2025】MIDAS：一种基于错配的用于失衡多模态学习的数据增强策略

从感知到认知：多模态大语言模型中视觉-语言交互推理综述

相关资讯

【资源】语音增强资源集锦

【资源】语音增强资源集锦

专知

8+阅读 · 2020年7月4日

LibRec 精选：AutoML for Contextual Bandits

LibRec 精选：AutoML for Contextual Bandits

LibRec智能推荐

7+阅读 · 2019年9月19日

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Facebook开源增强版LASER库，包含93种语言工具包

Facebook开源增强版LASER库，包含93种语言工具包

机器之心

5+阅读 · 2019年1月23日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Facebook PyText 在 Github 上开源了

Facebook PyText 在 Github 上开源了

AINLP

7+阅读 · 2018年12月14日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新四篇CVPR2018 视频描述生成相关论文—双向注意力、Transformer、重构网络、层次强化学习

【论文推荐】最新四篇CVPR2018 视频描述生成相关论文—双向注意力、Transformer、重构网络、层次强化学习

专知

31+阅读 · 2018年6月4日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

相关论文

Question Answering Infused Pre-training of General-Purpose Contextualized Representations

Arxiv

0+阅读 · 2021年6月15日

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

Arxiv

2+阅读 · 2021年6月14日

Exploiting Sentence-Level Representations for Passage Ranking

Arxiv

0+阅读 · 2021年6月14日

S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations

Arxiv

0+阅读 · 2021年6月14日

Implicit Representations of Meaning in Neural Language Models

Arxiv

0+阅读 · 2021年6月1日

Text Summarization with Pretrained Encoders

Arxiv

5+阅读 · 2019年8月22日

Phrase-Based & Neural Unsupervised Machine Translation

Phrase-Based & Neural Unsupervised Machine Translation

Arxiv

9+阅读 · 2018年8月13日

Recursive Neural Network Based Preordering for English-to-Japanese Machine Translation

Arxiv

7+阅读 · 2018年5月25日

Deep contextualized word representations

Arxiv

10+阅读 · 2018年3月22日

What Level of Quality can Neural Machine Translation Attain on Literary Text?

Arxiv

5+阅读 · 2018年1月15日

微信扫码咨询专知VIP会员