E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model - Zhuanzhi Papers


Cascade · E2E · Speech Recognition · MoDELS · MS

November 28, 2022

E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model


W. Ronny Huang, Shuo-Yiin Chang, Tara N. Sainath, Yanzhang He, David Rybach, Robert David, Rohit Prabhavalkar, Cyril Allauzen, Cal Peyser, Trevor D. Strohman

We explore unifying a neural segmenter with two-pass cascaded encoder ASR into a single model. A key challenge is allowing the segmenter (which runs in real time, synchronously with the decoder) to finalize the 2nd pass (which runs 900 ms behind real time) without introducing user-perceived latency or deletion errors during inference. We propose a design where the neural segmenter is integrated with the causal 1st-pass decoder to emit an end-of-segment (EOS) signal in real time. The EOS signal is then used to finalize the non-causal 2nd pass. We experiment with different ways to finalize the 2nd pass, and find that a novel dummy frame injection strategy allows for simultaneous high-quality 2nd-pass results and low finalization latency. On a real-world long-form captioning task (YouTube), we achieve 2.4% relative WER and 140 ms EOS latency gains over a baseline VAD-based segmenter with the same cascaded encoder.
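The abstract does not spell out how dummy frame injection works, so the following is only a toy sketch of one plausible reading: a non-causal pass that needs a fixed amount of future context can be flushed at EOS by feeding it placeholder frames instead of waiting for 900 ms of real audio. The 60 ms frame duration, the `LookaheadPass` class, and the `finalize_with_dummy_frames` method are all hypothetical names and values chosen for illustration; only the 900 ms delay comes from the abstract.

```python
from collections import deque

FRAME_MS = 60                              # assumed frame duration (not from the abstract)
LOOKAHEAD_MS = 900                         # 2nd-pass delay stated in the abstract
RIGHT_CONTEXT = LOOKAHEAD_MS // FRAME_MS   # 15 frames of future context under this assumption
DUMMY = object()                           # sentinel standing in for a dummy (e.g. all-zero) frame


class LookaheadPass:
    """Toy stand-in for a non-causal 2nd pass (hypothetical, not the paper's model).

    A frame can only be emitted once `right_context` later frames have arrived.
    """

    def __init__(self, right_context):
        self.right_context = right_context
        self.buffer = deque()

    def push(self, frame):
        """Feed one frame; return the frames that became final (zero or one)."""
        self.buffer.append(frame)
        if len(self.buffer) > self.right_context:
            return [self.buffer.popleft()]
        return []

    def finalize_with_dummy_frames(self):
        """On EOS, inject dummy frames so every buffered real frame is flushed."""
        flushed = []
        for _ in range(self.right_context):
            flushed.extend(self.push(DUMMY))
        # Discard the dummies left behind so the next segment starts clean.
        self.buffer.clear()
        return [f for f in flushed if f is not DUMMY]


if __name__ == "__main__":
    second_pass = LookaheadPass(RIGHT_CONTEXT)
    emitted = []
    for t in range(20):                    # 20 real frames, then the 1st pass signals EOS
        emitted += second_pass.push(t)
    print("final before EOS:", emitted)
    print("flushed at EOS:  ", second_pass.finalize_with_dummy_frames())
```

In this sketch, the frames still waiting for future context at EOS are released immediately, which mirrors the latency argument in the abstract; how the real model constructs and consumes its dummy frames is not stated there.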

