A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection - 专知论文

March 7, 2023

Qing Wang, Jun Du, Hua-Xin Wu, Jia Pan, Feng Ma, Chin-Hui Lee

From arXiv: 13 pages, 8 figures. Accepted by Transactions on Audio, Speech and Language Processing.

In this paper, we propose a novel four-stage data augmentation approach to ResNet-Conformer based acoustic modeling for sound event localization and detection (SELD). First, we explore two spatial augmentation techniques, audio channel swapping (ACS) and multi-channel simulation (MCS), to deal with data sparsity in SELD. ACS and MCS augment the limited training data by expanding its direction of arrival (DOA) representations, so that acoustic models trained on the augmented data are robust to variations in the location of acoustic sources. Next, we investigate time-domain mixing (TDM) and time-frequency masking (TFM) to handle overlapping sound events and to increase data diversity. Finally, ACS, MCS, TDM and TFM are combined step by step to form an effective four-stage data augmentation scheme. Tested on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2020 data set, the proposed augmentation approach greatly improves system performance, and our submitted system ranked first in the SELD task of the DCASE 2020 Challenge. Furthermore, we employ a ResNet-Conformer architecture to model both global and local context dependencies of an audio sequence, and win first place in the DCASE 2022 SELD evaluations.
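To make three of the four stages concrete, here is a minimal Python sketch of ACS, TDM and TFM. It is an illustration under stated assumptions, not the authors' implementation. It assumes an idealized first-order Ambisonics (FOA) encoding with channel order [W, Y, Z, X], and it shows only two of the possible channel-swapping transforms. All function names (`encode_foa`, `acs_mirror_azimuth`, `acs_swap_xy`, `time_domain_mix`, `time_frequency_mask`) are hypothetical. MCS is omitted because the abstract does not describe how the simulation is done.

```python
import math
import random


def encode_foa(signal, azimuth_deg, elevation_deg):
    """Encode a mono signal into idealized FOA channels [W, Y, Z, X] (assumed format)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    gains = [1.0,
             math.sin(az) * math.cos(el),   # Y
             math.sin(el),                  # Z
             math.cos(az) * math.cos(el)]   # X
    return [[g * s for s in signal] for g in gains]


def acs_mirror_azimuth(foa, azimuth_deg, elevation_deg):
    """ACS example 1: negating Y maps azimuth to -azimuth; the label changes with it."""
    w, y, z, x = foa
    return [w, [-v for v in y], z, x], -azimuth_deg, elevation_deg


def acs_swap_xy(foa, azimuth_deg, elevation_deg):
    """ACS example 2: swapping X and Y maps azimuth to 90 - azimuth."""
    w, y, z, x = foa
    return [w, x, z, y], 90.0 - azimuth_deg, elevation_deg


def time_domain_mix(clip_a, clip_b):
    """TDM: add two multichannel clips sample by sample.

    The event labels of the mixture would be the union of both clips' labels.
    """
    return [[a + b for a, b in zip(ch_a, ch_b)]
            for ch_a, ch_b in zip(clip_a, clip_b)]


def time_frequency_mask(spec, max_time_width, max_freq_width, rng):
    """TFM: zero one random run of frames and one random band of bins.

    `spec` is a [time][frequency] feature map; a masked copy is returned.
    """
    n_t, n_f = len(spec), len(spec[0])
    t_w = rng.randint(0, min(max_time_width, n_t))
    f_w = rng.randint(0, min(max_freq_width, n_f))
    t0 = rng.randint(0, n_t - t_w)
    f0 = rng.randint(0, n_f - f_w)
    out = [row[:] for row in spec]
    for t in range(n_t):
        for f in range(n_f):
            if t0 <= t < t0 + t_w or f0 <= f < f0 + f_w:
                out[t][f] = 0.0
    return out
```

The point the sketch tries to capture is that each spatial transform changes the audio channels and the DOA label together, so the augmented clip stays correctly labeled. Under the assumed encoding, a source encoded at azimuth 60° and then passed through `acs_swap_xy` matches the same source encoded directly at 30°.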

