Following the successful application of vision transformers in multiple computer vision tasks, these models have drawn the attention of the signal processing community. This is because signals are often represented as spectrograms (e.g. via the Discrete Fourier Transform), which can be provided directly as input to vision transformers. However, naively applying transformers to spectrograms is suboptimal. Since the two axes represent distinct dimensions, i.e. frequency and time, we argue that a better approach is to separate the attention dedicated to each axis. To this end, we propose the Separable Transformer (SepTr), an architecture that employs two transformer blocks in a sequential manner, the first attending to tokens within the same time interval, and the second attending to tokens within the same frequency bin. We conduct experiments on three benchmark data sets, showing that our separable architecture outperforms conventional vision transformers and other state-of-the-art methods. Unlike standard transformers, SepTr scales the number of trainable parameters linearly with the input size, thus having a lower memory footprint. Our code is available as open source at https://github.com/ristea/septr.
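The core idea of the abstract can be sketched in a few lines: attend over the frequency axis within each time interval, then over the time axis within each frequency bin. The snippet below is a minimal illustrative sketch, not the authors' implementation: it uses plain (unprojected, single-head) scaled dot-product self-attention and omits the learned projections, MLP sub-layers, normalization, and class tokens that the full SepTr blocks contain; the tensor layout `(T, F, d)` is an assumption for clarity.

```python
import numpy as np

def self_attention(x):
    # Single-head scaled dot-product self-attention over the
    # second-to-last axis (the token axis); batched over leading axes.
    # NOTE: no learned Q/K/V projections, for illustration only.
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def septr_sketch(spec):
    # spec: (T, F, d) spectrogram tokens embedded into d dimensions
    # (T time intervals, F frequency bins) -- a hypothetical layout.
    # Step 1: tokens within the SAME time interval attend to each
    # other, i.e. attention runs over the F axis, batched over T.
    y = self_attention(spec)                 # (T, F, d)
    # Step 2: tokens within the SAME frequency bin attend to each
    # other, i.e. attention runs over the T axis, batched over F.
    y = np.swapaxes(y, 0, 1)                 # (F, T, d)
    y = self_attention(y)                    # (F, T, d)
    return np.swapaxes(y, 0, 1)              # back to (T, F, d)
```

Because each attention step only ever compares tokens along one axis, the attention matrices are of size F×F or T×T rather than (T·F)×(T·F), which is where the lower memory footprint claimed in the abstract comes from.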