DCASE 2021 任务3:多声声音事件定位和探测的分光同步特征 (DCASE 2021 Task 3: Spectrotemporally-aligned Features for Polyphonic Sound Event Localization and Detection) - 专知论文

会员服务 ·

0

估计/估计量 · 协方差矩阵 · Performer · MoDELS · 规范化的 ·

2021 年 6 月 29 日

DCASE 2021 Task 3: Spectrotemporally-aligned Features for Polyphonic Sound Event Localization and Detection

翻译：DCASE 2021 任务3:多声声音事件定位和探测的分光同步特征

Thi Ngoc Tho Nguyen,Karn Watcharasupat,Ngoc Khanh Nguyen,Douglas L. Jones,Woon Seng Gan

from arxiv, 5 pages, Technical Report for DCASE 2021 Challenge Task 3

Sound event localization and detection consists of two subtasks which are sound event detection and direction-of-arrival estimation. While sound event detection mainly relies on time-frequency patterns to distinguish different sound classes, direction-of-arrival estimation uses magnitude or phase differences between microphones to estimate source directions. Therefore, it is often difficult to jointly train these two subtasks simultaneously. We propose a novel feature called spatial cue-augmented log-spectrogram (SALSA) with exact time-frequency mapping between the signal power and the source direction-of-arrival. The feature includes multichannel log-spectrograms stacked along with the estimated direct-to-reverberant ratio and a normalized version of the principal eigenvector of the spatial covariance matrix at each time-frequency bin on the spectrograms. Experimental results on the DCASE 2021 dataset for sound event localization and detection with directional interference showed that the deep learning-based models trained on this new feature outperformed the DCASE challenge baseline by a large margin. We combined several models with slightly different architectures that were trained on the new feature to further improve the system performances for the DCASE sound event localization and detection challenge.

翻译：正确事件本地化和探测由两个子任务组成,它们是健全的事件探测和抵达方向估计。健全的事件探测主要依靠时间频率模式来区分不同的声音类别,而抵达方向估计则使用麦克风之间的大小或相位差异来估计源方向。因此,往往难以同时对这两个子任务进行同步培训。我们提出了一个新颖的特征,即空间暗示增强的日志谱(SALSA),在信号功率和抵达源方向之间精确的时间频率绘图。功能包括多频道日志谱,与估计的直对视比率一起堆叠在一起,以及每个时频中每个时频中空间常变异矩阵主机的正常版本,用于光谱图。DCASE 2021 数据集的实验结果显示,就这个新特征所训练的深层次学习模型比DCASE挑战基线大边缘。我们将几个模型与稍有差异的图像元件组合起来,用于对新特征检测系统进行微不同的测试。我们把一些地方模型与地方性能模型结合起来,用来改进了对新特征的探测系统。

0

相关内容

估计/估计量

估计/估计量

【ETH】最新《几何数据分析》2020课程，附PPT下载

专知会员服务

45+阅读 · 2020年12月18日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

【CVPR2020】强化特征点，Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task

【CVPR2020】强化特征点，Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task

专知会员服务

49+阅读 · 2020年2月25日

【ECML-PKDD 2019】多维时间序列和事件日志的模式挖掘和异常检测框架（A framework for pattern mining and anomalydetection in multi-dimensional time series andevent logs）

【ECML-PKDD 2019】多维时间序列和事件日志的模式挖掘和异常检测框架（A framework for pattern mining and anomalydetection in multi-dimensional time series andevent logs）

专知会员服务

38+阅读 · 2019年12月1日

【ECML-PKDD 2019】用于处理多维语义轨迹和预测未来语义位置的多通道卷积神经网络（Multi-Channel Convolutional Neural Networks for Handling Multi-Dimensional Semantic Trajectories and Predicting Future Semantic Locations）

【ECML-PKDD 2019】用于处理多维语义轨迹和预测未来语义位置的多通道卷积神经网络（Multi-Channel Convolutional Neural Networks for Handling Multi-Dimensional Semantic Trajectories and Predicting Future Semantic Locations）

专知会员服务

7+阅读 · 2019年12月1日

【变分推断课件】Lectures on Variational Inference：Statistical Analysis of Variational Approximations（附带pdf）

【变分推断课件】Lectures on Variational Inference：Statistical Analysis of Variational Approximations（附带pdf）

专知会员服务

16+阅读 · 2019年11月30日

【AAAI2020】实体关系联合抽取的编码器-解码器结构的有效建模（ Effective Modeling of Encoder-Decoder Architecture for Joint Entity and Relation Extraction）

【AAAI2020】实体关系联合抽取的编码器-解码器结构的有效建模（ Effective Modeling of Encoder-Decoder Architecture for Joint Entity and Relation Extraction）

专知会员服务

53+阅读 · 2019年11月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

计算机 | CCF推荐期刊专刊信息5条

计算机 | CCF推荐期刊专刊信息5条

Call4Papers

3+阅读 · 2019年4月10日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

泡泡机器人SLAM

13+阅读 · 2019年1月16日

大数据 | 顶级SCI期刊专刊/国际会议信息7条

大数据 | 顶级SCI期刊专刊/国际会议信息7条

Call4Papers

10+阅读 · 2018年12月29日

人工智能 | COLT 2019等国际会议信息9条

人工智能 | COLT 2019等国际会议信息9条

Call4Papers

6+阅读 · 2018年9月21日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Relation Networks for Object Detection 论文笔记

Relation Networks for Object Detection 论文笔记

统计学习与视觉计算组

16+阅读 · 2018年4月18日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Novel Bayesian method for simultaneous detection of activation signatures and background connectivity for task fMRI data

Arxiv

0+阅读 · 2021年9月1日

MBDF-Net: Multi-Branch Deep Fusion Network for 3D Object Detection

Arxiv

0+阅读 · 2021年8月29日

Feature Extraction for Machine Learning-based Intrusion Detection in IoT Networks

Arxiv

0+阅读 · 2021年8月28日

Fast Rule-Based Clutter Detection in Automotive Radar Data

Fast Rule-Based Clutter Detection in Automotive Radar Data

Arxiv

0+阅读 · 2021年8月27日

Few-shot Learning for Multi-label Intent Detection

Arxiv

21+阅读 · 2020年10月11日

NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection

Arxiv

7+阅读 · 2019年4月16日

Reverse Attention for Salient Object Detection

Arxiv

11+阅读 · 2019年4月15日

Improved Speech Enhancement with the Wave-U-Net

Arxiv

8+阅读 · 2018年11月27日

A Convolutional Feature Map based Deep Network targeted towards Traffic Detection and Classification

Arxiv

3+阅读 · 2018年5月22日

Learning View-Specific Deep Networks for Person Re-Identification

Arxiv

7+阅读 · 2018年3月30日

VIP会员

文章信息

相关主题

估计/估计量

协方差矩阵

相关VIP内容

【ETH】最新《几何数据分析》2020课程，附PPT下载

专知会员服务

45+阅读 · 2020年12月18日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

【CVPR2020】强化特征点，Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task

【CVPR2020】强化特征点，Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task

专知会员服务

49+阅读 · 2020年2月25日

【ECML-PKDD 2019】多维时间序列和事件日志的模式挖掘和异常检测框架（A framework for pattern mining and anomalydetection in multi-dimensional time series andevent logs）

【ECML-PKDD 2019】多维时间序列和事件日志的模式挖掘和异常检测框架（A framework for pattern mining and anomalydetection in multi-dimensional time series andevent logs）

专知会员服务

38+阅读 · 2019年12月1日

【ECML-PKDD 2019】用于处理多维语义轨迹和预测未来语义位置的多通道卷积神经网络（Multi-Channel Convolutional Neural Networks for Handling Multi-Dimensional Semantic Trajectories and Predicting Future Semantic Locations）

【ECML-PKDD 2019】用于处理多维语义轨迹和预测未来语义位置的多通道卷积神经网络（Multi-Channel Convolutional Neural Networks for Handling Multi-Dimensional Semantic Trajectories and Predicting Future Semantic Locations）

专知会员服务

7+阅读 · 2019年12月1日

【变分推断课件】Lectures on Variational Inference：Statistical Analysis of Variational Approximations（附带pdf）

【变分推断课件】Lectures on Variational Inference：Statistical Analysis of Variational Approximations（附带pdf）

专知会员服务

16+阅读 · 2019年11月30日

【AAAI2020】实体关系联合抽取的编码器-解码器结构的有效建模（ Effective Modeling of Encoder-Decoder Architecture for Joint Entity and Relation Extraction）

【AAAI2020】实体关系联合抽取的编码器-解码器结构的有效建模（ Effective Modeling of Encoder-Decoder Architecture for Joint Entity and Relation Extraction）

专知会员服务

53+阅读 · 2019年11月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

自动驾驶轨迹规划中的基础模型：进展综述与开放挑战

《用于提升多域战备的大型语言模型辅助场景生成器》报告

【斯坦福博士论文】为人类使用优化 AI 模型

国防领域人工智能规模化应用的理论与实践

相关资讯

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

计算机 | CCF推荐期刊专刊信息5条

计算机 | CCF推荐期刊专刊信息5条

Call4Papers

3+阅读 · 2019年4月10日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

泡泡机器人SLAM

13+阅读 · 2019年1月16日

大数据 | 顶级SCI期刊专刊/国际会议信息7条

大数据 | 顶级SCI期刊专刊/国际会议信息7条

Call4Papers

10+阅读 · 2018年12月29日

人工智能 | COLT 2019等国际会议信息9条

人工智能 | COLT 2019等国际会议信息9条

Call4Papers

6+阅读 · 2018年9月21日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Relation Networks for Object Detection 论文笔记

Relation Networks for Object Detection 论文笔记

统计学习与视觉计算组

16+阅读 · 2018年4月18日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

相关论文

Novel Bayesian method for simultaneous detection of activation signatures and background connectivity for task fMRI data

Arxiv

0+阅读 · 2021年9月1日

MBDF-Net: Multi-Branch Deep Fusion Network for 3D Object Detection

Arxiv

0+阅读 · 2021年8月29日

Feature Extraction for Machine Learning-based Intrusion Detection in IoT Networks

Arxiv

0+阅读 · 2021年8月28日

Fast Rule-Based Clutter Detection in Automotive Radar Data

Fast Rule-Based Clutter Detection in Automotive Radar Data

Arxiv

0+阅读 · 2021年8月27日

Few-shot Learning for Multi-label Intent Detection

Arxiv

21+阅读 · 2020年10月11日

NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection

Arxiv

7+阅读 · 2019年4月16日

Reverse Attention for Salient Object Detection

Arxiv

11+阅读 · 2019年4月15日

Improved Speech Enhancement with the Wave-U-Net

Arxiv

8+阅读 · 2018年11月27日

A Convolutional Feature Map based Deep Network targeted towards Traffic Detection and Classification

Arxiv

3+阅读 · 2018年5月22日

Learning View-Specific Deep Networks for Person Re-Identification

Arxiv

7+阅读 · 2018年3月30日

微信扫码咨询专知VIP会员