Modern wake word detection systems usually rely on neural networks for acoustic modeling. Transformers have recently shown superior performance over LSTMs and convolutional networks in various sequence modeling tasks, owing to their stronger temporal modeling capacity. However, it is not clear whether this advantage still holds for short-range temporal modeling tasks such as wake word detection. Moreover, the vanilla Transformer is not directly applicable to the task because of its non-streaming nature and its quadratic time and space complexity. In this paper we explore the performance of several variants of chunk-wise streaming Transformers tailored for wake word detection in a recently proposed LF-MMI system, including look-ahead to the next chunk, gradient stopping, different positional embedding methods, and same-layer dependency between chunks. Our experiments on the Mobvoi wake word dataset demonstrate that our proposed Transformer model outperforms the baseline convolutional network by 25% on average in false rejection rate at the same false alarm rate with a comparable model size, while still maintaining linear time and space complexity with respect to the sequence length.
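The chunk-wise streaming idea, with optional look-ahead to the next chunk, can be illustrated by the attention mask it induces. The sketch below is a hypothetical illustration (function name and parameters are ours, not the authors' implementation): each query position attends only to its own fixed-size chunk, and, when look-ahead is enabled, to the immediately following chunk, so the number of attended key positions per query is bounded by a constant and the overall cost is linear in the sequence length.

```python
def chunk_attention_mask(seq_len, chunk_size, look_ahead=True):
    """Boolean self-attention mask for chunk-wise streaming attention.

    mask[i][j] is True iff query position i may attend to key position j.
    Each position attends within its own chunk; with look_ahead=True it may
    also attend to the next chunk (a sketch of the look-ahead variant).
    """
    mask = [[False] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        c = i // chunk_size                       # chunk index of the query
        start = c * chunk_size                    # first key in own chunk
        # with look-ahead, extend the window by one extra chunk
        end = min((c + (2 if look_ahead else 1)) * chunk_size, seq_len)
        for j in range(start, end):
            mask[i][j] = True
    return mask
```

Because each row of the mask has at most `2 * chunk_size` True entries, applying it (e.g. as an additive `-inf` mask before the softmax) keeps attention cost linear in `seq_len` rather than quadratic.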