Improving Speech-to-Speech Translation Through Unlabeled Text


October 26, 2022


Xuan-Phi Nguyen, Sravya Popuri, Changhan Wang, Yun Tang, Ilia Kulikov, Hongyu Gong

Direct speech-to-speech translation (S2ST) is among the most challenging problems in the translation paradigm due to the significant scarcity of S2ST data. While effort has been made to increase the data size from unlabeled speech by cascading pretrained speech recognition (ASR), machine translation (MT), and text-to-speech (TTS) models, unlabeled text has remained relatively under-utilized for improving S2ST. We propose an effective way to utilize the massive existing unlabeled text from different languages to create a large amount of S2ST data, and to improve S2ST performance by applying various acoustic effects to the generated synthetic data. Empirically, our method outperforms the state of the art in Spanish-English translation by up to 2 BLEU. The proposed method also demonstrates significant gains in extremely low-resource settings for both Spanish-English and Russian-English translation.
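The data-creation idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which acoustic effects are used or which side of the pair they are applied to, so the additive noise and single-tap echo below, and the choice to augment only the source audio, are assumptions made purely for illustration. The `translate`, `tts_src`, and `tts_tgt` callables are hypothetical stand-ins for pretrained MT and TTS models, not the authors' components.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Waveform = List[float]


def add_noise(wave: Sequence[float], snr_db: float, rng: random.Random) -> Waveform:
    """Add white Gaussian noise at roughly the requested signal-to-noise ratio."""
    power = sum(x * x for x in wave) / max(len(wave), 1)
    noise_std = math.sqrt(power / (10 ** (snr_db / 10)))
    return [x + rng.gauss(0.0, noise_std) for x in wave]


def add_echo(wave: Sequence[float], delay: int, decay: float) -> Waveform:
    """Mix in a single delayed, attenuated copy of the signal."""
    out = list(wave)
    for i in range(delay, len(wave)):
        out[i] += decay * wave[i - delay]
    return out


def synthesize_pairs(
    sentences: Sequence[str],
    translate: Callable[[str], str],     # hypothetical MT model
    tts_src: Callable[[str], Waveform],  # hypothetical source-language TTS
    tts_tgt: Callable[[str], Waveform],  # hypothetical target-language TTS
    seed: int = 0,
) -> List[Tuple[Waveform, Waveform]]:
    """Turn unlabeled source text into (augmented source audio, target audio) pairs."""
    rng = random.Random(seed)
    pairs = []
    for text in sentences:
        src = tts_src(text)              # synthetic source speech
        tgt = tts_tgt(translate(text))   # synthetic target speech
        # Assumed effects: vary the clean TTS audio so it sounds less uniform.
        src = add_echo(src, delay=rng.randint(1, 4), decay=0.3)
        src = add_noise(src, snr_db=rng.uniform(10.0, 30.0), rng=rng)
        pairs.append((src, tgt))
    return pairs
```

In practice the callables would wrap real models and the waveforms would be arrays rather than Python lists; the sketch only shows how unlabeled text alone can yield paired audio for training.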

