音响器: Keyword Spointting 的革命变换器 (Audiomer: A Convolutional Transformer for Keyword Spotting) - 专知论文

会员服务 ·

0

Performer · 变换 · state-of-the-art · 卷积 · 残差网络 ·

2021 年 9 月 21 日

Audiomer: A Convolutional Transformer for Keyword Spotting

翻译：音响器: Keyword Spointting 的革命变换器

Surya Kant Sahu,Sai Mitheran,Juhi Kamdar,Meet Gandhi

from arxiv, Submitted to NeurIPS 2021 ENLSP Workshop

Transformers have seen an unprecedented rise in Natural Language Processing and Computer Vision tasks. However, in audio tasks, they are either infeasible to train due to extremely large sequence length of audio waveforms or reach competitive performance after feature extraction through Fourier-based methods, incurring a loss-floor. In this work, we introduce an architecture, Audiomer, where we combine 1D Residual Networks with Performer Attention to achieve state-of-the-art performance in Keyword Spotting with raw audio waveforms, out-performing all previous methods while also being computationally cheaper, much more parameter and data-efficient. Audiomer allows for deployment in compute-constrained devices and training on smaller datasets.

翻译：然而,在音频任务中,由于音频波形的序列长度非常之大,或者在通过基于Fourier的方法进行地貌提取后达到竞争性性能,从而造成损失。在这项工作中,我们引入了一个架构,即1D残余网络与表演者关注结合起来,在关键字中实现最先进的性能,即与原始音频波形相匹配,优于以往所有方法,同时计算成本低、参数高得多、数据效率高。音频可以部署在计算受限制的装置和较小数据集的培训中。

0

相关内容

Performer

【ICLR2021】彩色化变换器，Colorization Transformer

【ICLR2021】彩色化变换器，Colorization Transformer

专知会员服务

10+阅读 · 2021年2月9日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

320+阅读 · 2020年11月26日

【Google】大迁移：通用视觉表示学习，General Visual Representation Learning

【Google】大迁移：通用视觉表示学习，General Visual Representation Learning

专知会员服务

37+阅读 · 2020年5月9日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【Google无监督大规模视觉表示迁移】Large Scale Learning of General Visual Representations for Transfer

【Google无监督大规模视觉表示迁移】Large Scale Learning of General Visual Representations for Transfer

专知会员服务

12+阅读 · 2020年1月7日

【Google论文】ALBERT:自我监督学习语言表达的精简BERT

【Google论文】ALBERT:自我监督学习语言表达的精简BERT

专知会员服务

24+阅读 · 2019年11月4日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

超强合集：OCR文本检测干货汇总（含论文、源码、demo等资源）

超强合集：OCR文本检测干货汇总（含论文、源码、demo等资源）

极市平台

33+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Attention Bottlenecks for Multimodal Fusion

Arxiv

31+阅读 · 2021年6月30日

VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs

Arxiv

3+阅读 · 2021年1月29日

MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis

MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis

Arxiv

7+阅读 · 2019年10月8日

Convolutional Self-Attention Network

Arxiv

6+阅读 · 2019年4月8日

Pay Less Attention with Lightweight and Dynamic Convolutions

Pay Less Attention with Lightweight and Dynamic Convolutions

Arxiv

4+阅读 · 2019年1月29日

Graph Convolutional Networks for Text Classification

Arxiv

11+阅读 · 2018年10月17日

Detection and Segmentation of Manufacturing Defects with Convolutional Neural Networks and Transfer Learning

Detection and Segmentation of Manufacturing Defects with Convolutional Neural Networks and Transfer Learning

Arxiv

3+阅读 · 2018年8月7日

Tiny SSD: A Tiny Single-shot Detection Deep Convolutional Neural Network for Real-time Embedded Object Detection

Arxiv

7+阅读 · 2018年2月19日

Convolutional Sequence to Sequence Learning

Arxiv

4+阅读 · 2017年7月25日

Caffeinated FPGAs: FPGA Framework For Convolutional Neural Networks

Arxiv

10+阅读 · 2016年9月30日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

【ICLR2021】彩色化变换器，Colorization Transformer

【ICLR2021】彩色化变换器，Colorization Transformer

专知会员服务

10+阅读 · 2021年2月9日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

320+阅读 · 2020年11月26日

【Google】大迁移：通用视觉表示学习，General Visual Representation Learning

【Google】大迁移：通用视觉表示学习，General Visual Representation Learning

专知会员服务

37+阅读 · 2020年5月9日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【Google无监督大规模视觉表示迁移】Large Scale Learning of General Visual Representations for Transfer

【Google无监督大规模视觉表示迁移】Large Scale Learning of General Visual Representations for Transfer

专知会员服务

12+阅读 · 2020年1月7日

【Google论文】ALBERT:自我监督学习语言表达的精简BERT

【Google论文】ALBERT:自我监督学习语言表达的精简BERT

专知会员服务

24+阅读 · 2019年11月4日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

人工智能治理的未来

模态感知的特征匹配：单一模态与跨模态技术的全面综述

无监督行人重识别研究综述

【牛津博士论文】面向神经影像应用的可扩展且可解释的空间模型

相关资讯

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

超强合集：OCR文本检测干货汇总（含论文、源码、demo等资源）

超强合集：OCR文本检测干货汇总（含论文、源码、demo等资源）

极市平台

33+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

相关论文

Attention Bottlenecks for Multimodal Fusion

Arxiv

31+阅读 · 2021年6月30日

VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs

Arxiv

3+阅读 · 2021年1月29日

MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis

MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis

Arxiv

7+阅读 · 2019年10月8日

Convolutional Self-Attention Network

Arxiv

6+阅读 · 2019年4月8日

Pay Less Attention with Lightweight and Dynamic Convolutions

Pay Less Attention with Lightweight and Dynamic Convolutions

Arxiv

4+阅读 · 2019年1月29日

Graph Convolutional Networks for Text Classification

Arxiv

11+阅读 · 2018年10月17日

Detection and Segmentation of Manufacturing Defects with Convolutional Neural Networks and Transfer Learning

Detection and Segmentation of Manufacturing Defects with Convolutional Neural Networks and Transfer Learning

Arxiv

3+阅读 · 2018年8月7日

Tiny SSD: A Tiny Single-shot Detection Deep Convolutional Neural Network for Real-time Embedded Object Detection

Arxiv

7+阅读 · 2018年2月19日

Convolutional Sequence to Sequence Learning

Arxiv

4+阅读 · 2017年7月25日

Caffeinated FPGAs: FPGA Framework For Convolutional Neural Networks

Arxiv

10+阅读 · 2016年9月30日

微信扫码咨询专知VIP会员