We present a robust classification approach for avian vocalizations in complex and diverse soundscapes, which achieved second place in the BirdCLEF2021 challenge. We illustrate how to make full use of pre-trained convolutional neural networks through an efficient modeling and training routine supplemented by novel augmentation methods, thereby improving generalization from weakly labeled, crowd-sourced training data to production data collected by autonomous recording units. In doing so, we show how to progress towards an accurate automated assessment of avian populations, which would enable global biodiversity monitoring at a scale impossible with manual annotation.