Neural transducers have been widely used in automatic speech recognition (ASR). In this paper, we introduce them to streaming end-to-end speech translation (ST), which converts audio signals directly into text in another language. Compared with cascaded ST, which performs ASR followed by text-based machine translation (MT), the proposed Transformer transducer (TT)-based ST model drastically reduces inference latency, exploits speech information, and avoids error propagation from ASR to MT. To improve modeling capacity, we propose attention pooling for the joint network in TT. In addition, we extend TT-based ST to multilingual ST, which generates text in multiple languages at the same time. Experimental results on a large-scale training set of 50 thousand (K) hours of pseudo-labeled speech show that TT-based ST not only significantly reduces inference time but also outperforms non-streaming cascaded ST on English-German translation.
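To make the joint-network idea concrete: in a transducer, the joint network fuses an encoder (acoustic) state with a prediction-network (label) state at each alignment step, and attention pooling replaces a single encoder frame with an attention-weighted summary of a window of frames. The following is a minimal NumPy sketch of this pooling step; the function name, weight matrices, and the additive fusion are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def joint_attention_pool(enc_frames, pred_state, W_q, W_k, W_out):
    """Hypothetical attention-pooled joint step for a transducer.

    enc_frames: (T, d) window of encoder outputs
    pred_state: (d,)   prediction-network output at label step u
    W_q, W_k, W_out: (d, d) illustrative learned projections
    """
    q = pred_state @ W_q                       # query from the label side
    k = enc_frames @ W_k                       # keys from the acoustic frames
    scores = k @ q / np.sqrt(q.shape[-1])      # scaled dot-product scores, (T,)
    alpha = softmax(scores)                    # attention weights over frames
    pooled = alpha @ enc_frames                # pooled acoustic vector, (d,)
    # Fuse pooled acoustic and label states (additive fusion assumed here);
    # a real model would follow this with a projection to the vocabulary.
    return np.tanh((pooled + pred_state) @ W_out)
```

The pooled vector plays the role that a single encoder frame plays in a standard transducer joint network, letting each label step attend over several acoustic frames.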