调查加强反恐委员会的触发关注流模式调查 (An Investigation of Enhancing CTC Model for Triggered Attention-based Streaming ASR) - 专知论文

会员服务 ·

0

流 · 语音识别 · Performer · 模型评估 · INFORMS ·

2021 年 10 月 20 日

An Investigation of Enhancing CTC Model for Triggered Attention-based Streaming ASR

翻译：调查加强反恐委员会的触发关注流模式调查

Huaibo Zhao,Yosuke Higuchi,Tetsuji Ogawa,Tetsunori Kobayashi

from arxiv, Accepted to APSIPA 2021

In the present paper, an attempt is made to combine Mask-CTC and the triggered attention mechanism to construct a streaming end-to-end automatic speech recognition (ASR) system that provides high performance with low latency. The triggered attention mechanism, which performs autoregressive decoding triggered by the CTC spike, has shown to be effective in streaming ASR. However, in order to maintain high accuracy of alignment estimation based on CTC outputs, which is the key to its performance, it is inevitable that decoding should be performed with some future information input (i.e., with higher latency). It should be noted that in streaming ASR, it is desirable to be able to achieve high recognition accuracy while keeping the latency low. Therefore, the present study aims to achieve highly accurate streaming ASR with low latency by introducing Mask-CTC, which is capable of learning feature representations that anticipate future information (i.e., that can consider long-term contexts), to the encoder pre-training. Experimental comparisons conducted using WSJ data demonstrate that the proposed method achieves higher accuracy with lower latency than the conventional triggered attention-based streaming ASR system.

翻译：在本文件中,试图将Mask-CT和触发注意机制结合起来,以构建一个端到端自动语音识别(ASR)系统,提供高性能,低潜潜伏;触发注意机制,由CTC激增引发自动递减解码,在流出ASR方面证明是有效的;然而,为了保持基于CTC产出的校正估计的高度准确性,这是其性能的关键,因此,在未来的信息输入(即,高延缓度)下解码是不可避免的;应当指出,在流出ASR时,最好能够实现高度识别准确性,同时保持低潜伏;因此,本研究报告旨在通过引入Make-CT来非常精确地流出ASR,从而能够了解预测未来信息的特征描述(即,可以考虑长期背景)到编译前训练阶段。使用WSJ数据进行的实验性比较表明,拟议方法的精确度比常规的ASR导引引流系统要低。

0

相关内容

【SIGIR2021】ScaleFreeCTR：超大规模Embedding推荐模型分布式训练系统

专知会员服务

28+阅读 · 2021年4月26日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

专知会员服务

39+阅读 · 2020年1月30日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Diganta Misra等人提出新激活函数Mish，在一些任务上超越RuLU

Diganta Misra等人提出新激活函数Mish，在一些任务上超越RuLU

专知会员服务

15+阅读 · 2019年10月15日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

已删除

将门创投

5+阅读 · 2017年11月20日

Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition

Arxiv

0+阅读 · 2021年12月15日

Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks

Arxiv

10+阅读 · 2021年2月22日

Adaptive Attentional Network for Few-Shot Knowledge Graph Completion

Arxiv

17+阅读 · 2020年10月19日

Not All Attention Is Needed: Gated Attention Network for Sequence Data

Arxiv

3+阅读 · 2019年12月1日

An Attention-Gated Convolutional Neural Network for Sentence Classification

An Attention-Gated Convolutional Neural Network for Sentence Classification

Arxiv

4+阅读 · 2018年12月28日

End-to-end Speech Recognition with Word-based RNN Language Models

End-to-end Speech Recognition with Word-based RNN Language Models

Arxiv

3+阅读 · 2018年8月8日

Multi-Cast Attention Networks for Retrieval-based Question Answering and Response Prediction

Arxiv

3+阅读 · 2018年6月3日

End-to-End Multi-Task Learning with Attention

Arxiv

19+阅读 · 2018年3月28日

Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification

Arxiv

5+阅读 · 2018年3月27日

Multilingual Training and Cross-lingual Adaptation on CTC-based Acoustic Model

Arxiv

7+阅读 · 2018年1月23日

VIP会员

文章信息

相关主题

相关VIP内容

【SIGIR2021】ScaleFreeCTR：超大规模Embedding推荐模型分布式训练系统

专知会员服务

28+阅读 · 2021年4月26日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

专知会员服务

39+阅读 · 2020年1月30日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Diganta Misra等人提出新激活函数Mish，在一些任务上超越RuLU

Diganta Misra等人提出新激活函数Mish，在一些任务上超越RuLU

专知会员服务

15+阅读 · 2019年10月15日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

《步兵小单元山地严寒作战指南》美军最新条令200页

《联合作战概念的发展》最新报告

俄制无人机弹药

《复杂场景下自主着陆的模型预测控制技术》92页

相关资讯

已删除

将门创投

5+阅读 · 2017年11月20日

相关论文

Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition

Arxiv

0+阅读 · 2021年12月15日

Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks

Arxiv

10+阅读 · 2021年2月22日

Adaptive Attentional Network for Few-Shot Knowledge Graph Completion

Arxiv

17+阅读 · 2020年10月19日

Not All Attention Is Needed: Gated Attention Network for Sequence Data

Arxiv

3+阅读 · 2019年12月1日

An Attention-Gated Convolutional Neural Network for Sentence Classification

An Attention-Gated Convolutional Neural Network for Sentence Classification

Arxiv

4+阅读 · 2018年12月28日

End-to-end Speech Recognition with Word-based RNN Language Models

End-to-end Speech Recognition with Word-based RNN Language Models

Arxiv

3+阅读 · 2018年8月8日

Multi-Cast Attention Networks for Retrieval-based Question Answering and Response Prediction

Arxiv

3+阅读 · 2018年6月3日

End-to-End Multi-Task Learning with Attention

Arxiv

19+阅读 · 2018年3月28日

Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification

Arxiv

5+阅读 · 2018年3月27日

Multilingual Training and Cross-lingual Adaptation on CTC-based Acoustic Model

Arxiv

7+阅读 · 2018年1月23日

微信扫码咨询专知VIP会员