Proposal-based Few-shot Sound Event Detection for Speech and Environmental Sounds with Perceivers

There are many important applications for detecting and localizing specific sound events within long, untrimmed documents, including keyword spotting, medical observation, and bioacoustic monitoring for conservation. Deep learning techniques often set the state of the art for these tasks. However, for some types of events, there is insufficient labeled data to train deep learning models. In this paper, we propose novel approaches to few-shot sound event detection that use region proposals and the Perceiver architecture and can accurately localize sound events given very few examples of each class of interest. Motivated by a lack of suitable benchmark datasets for few-shot audio event detection, we generate and evaluate on two novel episodic rare sound event datasets: one using clips of celebrity speech as the sound event, and the other using environmental sounds. Our highest-performing proposed few-shot approaches achieve F1-scores of 0.575 and 0.672, respectively, on 5-shot 5-way tasks on these two datasets. These represent absolute improvements of 0.200 and 0.234 over strong proposal-free few-shot sound event detection baselines.


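Few-shot learning

Few-shot learning (FSL) addresses how to improve the performance of a neural network when only a small amount of data is available. A widely used family of FSL methods is meta-learning. Like ordinary neural network training, meta-learning has a training phase and a testing phase, which are called meta-training and meta-testing, respectively.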
