不要丢弃语音对文字翻译中的固定窗口音频分割 (Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation) - 专知论文

会员服务 ·

0

Performer · MoDELS · Continuity · 在线 · SimPLe ·

2022 年 10 月 24 日

Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation

翻译：不要丢弃语音对文字翻译中的固定窗口音频分割

Chantal Amrhein,Barry Haddow

from arxiv, accepted to WMT22

For real-life applications, it is crucial that end-to-end spoken language translation models perform well on continuous audio, without relying on human-supplied segmentation. For online spoken language translation, where models need to start translating before the full utterance is spoken, most previous work has ignored the segmentation problem. In this paper, we compare various methods for improving models' robustness towards segmentation errors and different segmentation strategies in both offline and online settings and report results on translation quality, flicker and delay. Our findings on five different language pairs show that a simple fixed-window audio segmentation can perform surprisingly well given the right conditions.

翻译：对于实际应用而言,关键是端对端口语翻译模式在连续的音频上运行良好,而不必依赖人供应的分化。对于在线口语翻译,在全语发音之前需要开始翻译的模式,大多数先前的工作忽视了分化问题。在本文中,我们比较了各种方法,以提高模型对分化错误和离线和在线设置中不同分化战略的稳健性,并报告了翻译质量、闪烁和延迟的结果。我们对五对不同语言的研究结果显示,如果条件合适,简单的固定式音频分化可以发挥出人意料的很好的效果。

0

相关内容

Performer

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

计算机科学课程与视频课件合集，Computer Science courses with video lectures

计算机科学课程与视频课件合集，Computer Science courses with video lectures

专知会员服务

37+阅读 · 2022年1月24日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

半线性广义Tricomi方程Cauchy问题解的生命跨度估计研究

国家自然科学基金

0+阅读 · 2017年12月31日

近Kenmotsu流形的曲率与Ricci孤立子

国家自然科学基金

0+阅读 · 2015年12月31日

基于核酸适配体-多功能胶体晶体编码微球液相芯片多元检测谷物中真菌毒素新体系研究

国家自然科学基金

0+阅读 · 2014年12月31日

压缩感知与稀疏信号恢复

国家自然科学基金

2+阅读 · 2014年12月31日

稀疏性在数字几何处理中的应用研究

国家自然科学基金

0+阅读 · 2013年12月31日

燃烧流和交通流的初边值问题

国家自然科学基金

0+阅读 · 2013年12月31日

镧系元素掺杂BiFeO3薄膜铁电与压电各向异性研究

国家自然科学基金

0+阅读 · 2011年12月31日

渗流及相关随机系统的极限行为研究

国家自然科学基金

0+阅读 · 2009年12月31日

遍历哈密顿系统的谱理论

国家自然科学基金

0+阅读 · 2009年12月31日

利用GPS与IM/WS干涉测量监测鲜水河断层变形

国家自然科学基金

0+阅读 · 2008年12月31日

Editing Models with Task Arithmetic

Arxiv

0+阅读 · 2022年12月8日

GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds

Arxiv

0+阅读 · 2022年12月7日

Learn to Explore: on Bootstrapping Interactive Data Exploration with Meta-learning

Arxiv

0+阅读 · 2022年12月7日

On the role of benchmarking data sets and simulations in method comparison studies

Arxiv

0+阅读 · 2022年12月6日

Ultra-high Resolution Image Segmentation via Locality-aware Context Fusion and Alternating Local Enhancement

Arxiv

0+阅读 · 2022年12月6日

Adversarial and Contrastive Variational Autoencoder for Sequential Recommendation

Arxiv

17+阅读 · 2021年3月19日

Unifying Vision-and-Language Tasks via Text Generation

Arxiv

10+阅读 · 2021年2月4日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Multilingual Sentiment Analysis: An RNN-Based Framework for Limited Data

Arxiv

12+阅读 · 2018年6月8日

Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling

Arxiv

16+阅读 · 2018年1月31日

VIP会员

文章信息

相关主题

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

计算机科学课程与视频课件合集，Computer Science courses with video lectures

计算机科学课程与视频课件合集，Computer Science courses with video lectures

专知会员服务

37+阅读 · 2022年1月24日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

视觉-语言-动作模型解析：从模块构成到里程碑与挑战

《解析陆域作战方向：一个概念性框架》报告

【博士论文】基于多模态基础模型的上下文学习

追寻真正的AI自主性：从遗留思维到战场优势

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

相关论文

Editing Models with Task Arithmetic

Arxiv

0+阅读 · 2022年12月8日

GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds

Arxiv

0+阅读 · 2022年12月7日

Learn to Explore: on Bootstrapping Interactive Data Exploration with Meta-learning

Arxiv

0+阅读 · 2022年12月7日

On the role of benchmarking data sets and simulations in method comparison studies

Arxiv

0+阅读 · 2022年12月6日

Ultra-high Resolution Image Segmentation via Locality-aware Context Fusion and Alternating Local Enhancement

Arxiv

0+阅读 · 2022年12月6日

Adversarial and Contrastive Variational Autoencoder for Sequential Recommendation

Arxiv

17+阅读 · 2021年3月19日

Unifying Vision-and-Language Tasks via Text Generation

Arxiv

10+阅读 · 2021年2月4日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Multilingual Sentiment Analysis: An RNN-Based Framework for Limited Data

Arxiv

12+阅读 · 2018年6月8日

Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling

Arxiv

16+阅读 · 2018年1月31日

相关基金

半线性广义Tricomi方程Cauchy问题解的生命跨度估计研究

国家自然科学基金

0+阅读 · 2017年12月31日

近Kenmotsu流形的曲率与Ricci孤立子

国家自然科学基金

0+阅读 · 2015年12月31日

基于核酸适配体-多功能胶体晶体编码微球液相芯片多元检测谷物中真菌毒素新体系研究

国家自然科学基金

0+阅读 · 2014年12月31日

压缩感知与稀疏信号恢复

国家自然科学基金

2+阅读 · 2014年12月31日

稀疏性在数字几何处理中的应用研究

国家自然科学基金

0+阅读 · 2013年12月31日

燃烧流和交通流的初边值问题

国家自然科学基金

0+阅读 · 2013年12月31日

镧系元素掺杂BiFeO3薄膜铁电与压电各向异性研究

国家自然科学基金

0+阅读 · 2011年12月31日

渗流及相关随机系统的极限行为研究

国家自然科学基金

0+阅读 · 2009年12月31日

遍历哈密顿系统的谱理论

国家自然科学基金

0+阅读 · 2009年12月31日

利用GPS与IM/WS干涉测量监测鲜水河断层变形

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员