End-to-End Speech Translation of Arabic to English Broadcast News


December 11, 2022


Fethi Bougares,Salim Jouili

From arXiv; Arabic Natural Language Processing Workshop 2022

Speech translation (ST) is the task of directly translating acoustic speech signals in a source language into text in a foreign language. The ST task has long been addressed with a pipeline approach built from two modules: an Automatic Speech Recognition (ASR) system in the source language, followed by a text-to-text Machine Translation (MT) system. In the past few years, there has been a paradigm shift towards end-to-end approaches using sequence-to-sequence deep neural network models. This paper presents our efforts towards the development of the first Broadcast News end-to-end Arabic-to-English speech translation system. Starting from independent ASR and MT LDC releases, we were able to identify about 92 hours of Arabic audio recordings for which the manual transcription was also translated into English at the segment level. These data were used to train and compare pipeline and end-to-end speech translation systems under multiple scenarios, including transfer learning and data augmentation techniques.
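The data-identification step mentioned in the abstract can be pictured with a minimal sketch. It assumes, purely for illustration, that ASR segments and English translations share a segment identifier; the abstract does not say how the LDC releases were actually matched. All names here (`AsrSegment`, `align_segments`, `total_hours`, the segment ids and placeholder texts) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AsrSegment:
    """One transcribed audio segment (hypothetical schema)."""
    segment_id: str
    start: float      # segment start time, in seconds
    end: float        # segment end time, in seconds
    transcript: str   # manual Arabic transcription


def align_segments(asr_segments, translations):
    """Keep only ASR segments that also have an English translation.

    `translations` maps segment_id -> English text. Matching on a shared id
    is an assumption of this sketch, not the paper's documented procedure.
    """
    aligned = []
    for seg in asr_segments:
        english = translations.get(seg.segment_id)
        if english:  # skip segments with no (or empty) translation
            aligned.append((seg, english))
    return aligned


def total_hours(aligned):
    """Total audio duration of the aligned segments, in hours."""
    return sum(seg.end - seg.start for seg, _ in aligned) / 3600.0


# Toy data: three transcribed segments, only two of them translated.
asr = [
    AsrSegment("show1_0001", 0.0, 1800.0, "<arabic transcript 1>"),
    AsrSegment("show1_0002", 1800.0, 3600.0, "<arabic transcript 2>"),
    AsrSegment("show2_0001", 0.0, 900.0, "<arabic transcript 3>"),
]
mt = {
    "show1_0001": "<english translation 1>",
    "show1_0002": "<english translation 2>",
}

aligned = align_segments(asr, mt)
hours = total_hours(aligned)
print(f"{len(aligned)} aligned segments, {hours:.2f} hours")
```

Each aligned pair yields an (audio span, Arabic transcript, English translation) triple, which is the kind of data that can feed both a pipeline system (ASR then MT) and an end-to-end system.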
