We study few-shot acoustic event detection (AED) in this paper. Few-shot learning enables the detection of new events from very limited labeled data. Compared with other research areas such as computer vision, few-shot learning for audio recognition has been under-studied. We formulate the few-shot AED problem and explore different ways of adapting traditional supervised methods to this setting, as well as a variety of meta-learning approaches, which are conventionally used to solve few-shot classification problems. Compared with supervised baselines, meta-learning models achieve superior performance, demonstrating their effectiveness at generalizing to new audio events. Our analysis, including the impact of initialization and domain discrepancy, further validates the advantage of meta-learning approaches in few-shot AED.