Digital Einstein Experience: Fast Text-to-Speech for Conversational AI - 专知 Papers


FAST · Speech Synthesis · INTERACT · Phoneme · AI

July 21, 2021

Digital Einstein Experience: Fast Text-to-Speech for Conversational AI


Joanna Rownicka, Kilian Sprenkamp, Antonio Tripiana, Volodymyr Gromoglasov, Timo P Kunz

From arXiv, accepted at Interspeech 2021

We describe our approach to creating and delivering a custom voice for a conversational AI use case. More specifically, we provide a voice for a Digital Einstein character to enable human-computer interaction within the digital conversation experience. To create a voice that fits the context well, we first design a voice character and produce recordings that correspond to the desired speech attributes. We then model the voice. Our solution uses FastSpeech 2 to predict log-scaled mel-spectrograms from phonemes and Parallel WaveGAN to generate the waveforms. The system takes character input and outputs a speech waveform. We use a custom dictionary for selected words to ensure their proper pronunciation. Our proposed cloud architecture enables fast voice delivery, making it possible to talk to the digital version of Albert Einstein in real time.
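The data flow described in the abstract (characters, then phonemes with custom-dictionary overrides, then a mel-spectrogram, then a waveform) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: `CUSTOM_DICT`, `fallback_g2p`, `dummy_acoustic_model`, `dummy_vocoder`, the ARPAbet-style phoneme symbols, the 80 mel bins, and the hop size of 256 are all assumptions made for the example. The two dummy functions only stand in for FastSpeech 2 and Parallel WaveGAN and produce silence.

```python
import re
from typing import Callable, Dict, List

# Hypothetical custom dictionary: word -> phoneme sequence.
# The ARPAbet-style symbols are illustrative; the abstract does not give the phoneme set.
CUSTOM_DICT: Dict[str, List[str]] = {
    "einstein": ["AY1", "N", "S", "T", "AY2", "N"],
}

N_MELS = 80     # assumed number of mel bins
HOP_SIZE = 256  # assumed number of waveform samples per mel frame


def fallback_g2p(word: str) -> List[str]:
    """Placeholder grapheme-to-phoneme step: one pseudo-phoneme per letter."""
    return [ch.upper() for ch in word if ch.isalpha()]


def text_to_phonemes(
    text: str,
    custom: Dict[str, List[str]] = CUSTOM_DICT,
    g2p: Callable[[str], List[str]] = fallback_g2p,
) -> List[str]:
    """Convert character input to phonemes; custom entries take priority."""
    phonemes: List[str] = []
    for word in re.findall(r"[a-z']+", text.lower()):
        phonemes.extend(custom.get(word) or g2p(word))
    return phonemes


def dummy_acoustic_model(phonemes: List[str]) -> List[List[float]]:
    """Stand-in for FastSpeech 2: one silent mel frame per phoneme."""
    return [[0.0] * N_MELS for _ in phonemes]


def dummy_vocoder(mel: List[List[float]]) -> List[float]:
    """Stand-in for Parallel WaveGAN: HOP_SIZE silent samples per frame."""
    return [0.0] * (len(mel) * HOP_SIZE)


def synthesize(
    text: str,
    acoustic_model: Callable[[List[str]], List[List[float]]] = dummy_acoustic_model,
    vocoder: Callable[[List[List[float]]], List[float]] = dummy_vocoder,
) -> List[float]:
    """Characters -> phonemes -> mel-spectrogram -> waveform."""
    phonemes = text_to_phonemes(text)
    mel = acoustic_model(phonemes)
    return vocoder(mel)


if __name__ == "__main__":
    print(text_to_phonemes("Einstein said hi"))
    print(len(synthesize("Einstein")))
```

Checking the dictionary before the generic grapheme-to-phoneme step is one simple way to force the pronunciation of selected words; trained models would be passed in place of the two dummy callables.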


