ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit



April 11, 2023


Brian Yan, Jiatong Shi, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polák, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, Shinji Watanabe

From arXiv. Comment: there will be some major updates to the paper; thus, it has been withdrawn.

ESPnet-ST-v2 is a revamp of the open-source ESPnet-ST toolkit necessitated by the broadening interests of the spoken language translation community. ESPnet-ST-v2 supports 1) offline speech-to-text translation (ST), 2) simultaneous speech-to-text translation (SST), and 3) offline speech-to-speech translation (S2ST) -- each task is supported with a wide variety of approaches, differentiating ESPnet-ST-v2 from other open source spoken language translation toolkits. This toolkit offers state-of-the-art architectures such as transducers, hybrid CTC/attention, multi-decoders with searchable intermediates, time-synchronous blockwise CTC/attention, Translatotron models, and direct discrete unit models. In this paper, we describe the overall design, example models for each task, and performance benchmarking behind ESPnet-ST-v2, which is publicly available at https://github.com/espnet/espnet.



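Speech translation

Direct speech translation between languages by computer, which helps people from different language backgrounds communicate, has become a research focus around the world. Unlike ordinary text translation, speech translation has to integrate three technologies (speech recognition, machine translation, and speech synthesis), which makes it highly challenging.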
