Segment-level Metric Learning for Few-shot Bioacoustic Event Detection

Few-shot bioacoustic event detection is the task of detecting when a novel sound occurs, given only a few labeled examples of it. Previous methods employ metric learning to build a latent space from the labeled parts of different sound classes, also known as positive events. In this study, we propose a segment-level few-shot learning framework that utilizes both positive and negative events during model optimization. Training with negative events, which are larger in volume than positive events, can increase the generalization ability of the model. In addition, we use transductive inference on the validation set during training for better adaptation to novel classes. We conduct ablation studies on the proposed method with different setups for input features, training data, and hyper-parameters. Our final system achieves an F-measure of 62.73 on the DCASE 2022 challenge task 5 (DCASE2022-T5) validation set, outperforming the baseline prototypical network (34.02) by a large margin. Using the proposed method, our submitted system ranks 2nd in DCASE2022-T5. The code of this paper is fully open-sourced at https://github.com/haoheliu/DCASE_2022_Task_5.
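The idea of classifying segments against both a positive and a negative prototype can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional vectors stand in for learned segment embeddings, the helper names (`mean_vector`, `sq_dist`, `positive_probability`) are hypothetical, and the squared Euclidean distance with a softmax is the usual prototypical-network choice, assumed here for illustration. Model training and transductive inference are omitted.

```python
import math

def mean_vector(vectors):
    """Prototype = element-wise mean of a list of embedding vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def positive_probability(query, pos_proto, neg_proto):
    """Softmax over negative squared distances to the two prototypes."""
    logit_pos = -sq_dist(query, pos_proto)
    logit_neg = -sq_dist(query, neg_proto)
    m = max(logit_pos, logit_neg)  # subtract the max for numerical stability
    e_pos = math.exp(logit_pos - m)
    e_neg = math.exp(logit_neg - m)
    return e_pos / (e_pos + e_neg)

# Toy segment embeddings (assumed values, not real data).
pos_segments = [[1.9, 2.1], [2.0, 2.0], [2.1, 1.9]]
# Negative (background) segments are typically more numerous than positives.
neg_segments = [[-2.0, -1.9], [-2.1, -2.0], [-1.9, -2.1], [-2.0, -2.0]]

pos_proto = mean_vector(pos_segments)
neg_proto = mean_vector(neg_segments)

# Each query segment is labeled positive if it lies closer to the positive prototype.
queries = [[1.8, 2.2], [-2.2, -1.8]]
probs = [positive_probability(q, pos_proto, neg_proto) for q in queries]
preds = [p > 0.5 for p in probs]
```

In a detection setting, consecutive segments predicted as positive would then be merged into events with onset and offset times; that post-processing step is left out of this sketch.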
