A Hybrid System of Sound Event Detection Transformer and Frame-wise Model for DCASE 2022 Task 4

In this paper, we describe in detail our system for DCASE 2022 Task 4. The system combines two considerably different models: an end-to-end Sound Event Detection Transformer (SEDT) and a frame-wise model, Metric Learning and Focal Loss CNN (MLFL-CNN). The former is an event-wise model that learns event-level representations and predicts sound event categories and boundaries directly. The latter follows the widely adopted frame-classification scheme, in which each frame is classified into event categories and event boundaries are obtained by post-processing such as thresholding and smoothing. For SEDT, self-supervised pre-training on unlabeled data is applied, and semi-supervised learning is adopted through an online teacher, which is updated from the student model using the Exponential Moving Average (EMA) strategy and generates reliable pseudo labels for weakly-labeled and unlabeled data. For the frame-wise model, the ICT-TOSHIBA system of DCASE 2021 Task 4 is used. Experimental results show that the hybrid system considerably outperforms either individual model and achieves a PSDS1 of 0.420 and a PSDS2 of 0.783 on the validation set without external data. The code is available at https://github.com/965694547/Hybrid-system-of-frame-wise-model-and-SEDT.
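The two mechanisms the abstract names, the EMA teacher update and the frame-wise post-processing (thresholding and smoothing), can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ema_update` and `frames_to_events`, the decay of 0.999, the 0.5 threshold, the 3-frame median window, and the hop size are all assumptions chosen for illustration, and model weights are stood in for by a plain dictionary of floats.

```python
from typing import Dict, List, Tuple


def ema_update(teacher: Dict[str, float], student: Dict[str, float],
               decay: float = 0.999) -> Dict[str, float]:
    """Update teacher weights in place as an exponential moving average
    of the student weights (decay value is an assumed default)."""
    for name, s_val in student.items():
        teacher[name] = decay * teacher[name] + (1.0 - decay) * s_val
    return teacher


def frames_to_events(probs: List[float], threshold: float = 0.5,
                     win: int = 3, hop_s: float = 0.1
                     ) -> List[Tuple[float, float]]:
    """Turn per-frame probabilities for one class into (onset, offset)
    pairs in seconds: threshold, median-smooth, then find active runs."""
    if win % 2 == 0:
        raise ValueError("win must be odd")
    if not probs:
        return []
    # Step 1: thresholding gives a binary activity sequence.
    active = [1 if p > threshold else 0 for p in probs]
    # Step 2: median smoothing, padding by repeating the edge frames.
    half = win // 2
    padded = [active[0]] * half + active + [active[-1]] * half
    smooth = [sorted(padded[i:i + win])[half] for i in range(len(active))]
    # Step 3: contiguous active runs become events.
    events: List[Tuple[float, float]] = []
    onset = None
    for i, a in enumerate(smooth):
        if a and onset is None:
            onset = i
        elif not a and onset is not None:
            events.append((onset * hop_s, i * hop_s))
            onset = None
    if onset is not None:
        events.append((onset * hop_s, len(smooth) * hop_s))
    return events


if __name__ == "__main__":
    teacher = ema_update({"w": 1.0}, {"w": 0.0}, decay=0.9)
    print(teacher)
    print(frames_to_events([0.1, 0.9, 0.2, 0.8, 0.9, 0.9, 0.1, 0.1],
                           hop_s=0.5))
```

Only the frame-wise model needs the second function; an event-wise model such as SEDT predicts boundaries directly, so no thresholding or smoothing step is involved there.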

