Target sound detection (TSD) aims to detect a target sound in a mixture audio given reference information. Previous methods use a conditional network to extract a sound-discriminative embedding from the reference audio, and then use it to detect the target sound in the mixture audio. However, the network performs very differently depending on the reference audio (e.g., it performs poorly for noisy or short reference clips), and tends to make wrong decisions for transient events (i.e., shorter than $1$ second). To overcome these problems, in this paper, we present a reference-aware and duration-robust network (RaDur) for TSD. More specifically, to make the network more aware of the reference information, we propose an embedding enhancement module that takes the mixture audio into account while generating the embedding, and apply attention pooling to strengthen the features of target-sound-related frames and weaken those of noisy frames. In addition, a duration-robust focal loss is proposed to help model events of different durations. To evaluate our method, we build two TSD datasets based on UrbanSound and AudioSet. Extensive experiments show the effectiveness of our method.
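The attention pooling mentioned above can be illustrated with a minimal sketch: frame-level features are weighted by a learned relevance score and summed into a single embedding, so frames related to the target sound contribute more than noisy frames. This is a generic illustration of attention pooling, not the paper's exact architecture; the projection vector `w` and the feature shapes are hypothetical.

```python
import numpy as np

def attention_pooling(frames, w):
    """Pool frame-level features (T, D) into one clip-level embedding (D,).

    `w` (D,) is a hypothetical learned projection that scores how
    relevant each frame is to the target sound; a softmax over time
    up-weights relevant frames and down-weights noisy ones.
    """
    scores = frames @ w                      # (T,) unnormalized relevance
    alpha = np.exp(scores - scores.max())    # numerically stable softmax
    alpha /= alpha.sum()                     # attention weights over time
    return alpha @ frames                    # (D,) weighted sum of frames

rng = np.random.default_rng(0)
frames = rng.standard_normal((10, 4))        # 10 frames, 4-dim features
w = rng.standard_normal(4)
emb = attention_pooling(frames, w)
print(emb.shape)                             # clip-level embedding: (4,)
```

In a trained system, `w` (or a small attention network in its place) would be learned jointly with the detector, so the pooled embedding emphasizes the frames most discriminative for the target sound.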