This paper introduces a novel dataset for polyphonic sound event detection in urban sound monitoring use cases. Based on isolated sounds taken from the FSD50K dataset, 20,000 polyphonic soundscapes are synthesized, with the sounds randomly positioned in the stereo panorama at different loudness levels. The paper discusses possible application scenarios, explains the dataset generation process in detail, and outlines current limitations of the proposed USM-SED dataset.
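To make the synthesis idea concrete, the following is a minimal sketch of mixing isolated mono sounds into one stereo soundscape with random pan positions and random levels. It is not the authors' generation pipeline: the constant-power panning law, the level range of -12 dB to 0 dB, the function names `pan_gains` and `mix_soundscape`, and the example labels are all assumptions made for illustration.

```python
import math
import random


def pan_gains(pan):
    """Constant-power stereo gains for pan in [-1 (left), +1 (right)].

    The panning law is an assumption; the abstract does not specify one.
    """
    angle = (pan + 1.0) * math.pi / 4.0  # map [-1, 1] to [0, pi/2]
    return math.cos(angle), math.sin(angle)


def mix_soundscape(sounds, length, rng, level_range_db=(-12.0, 0.0)):
    """Mix labeled mono sounds into a stereo soundscape of `length` samples.

    sounds: list of (label, samples) pairs, samples being a list of floats.
    level_range_db: hypothetical range from which each level is drawn.
    Returns the left channel, the right channel, and per-sound placements.
    """
    left = [0.0] * length
    right = [0.0] * length
    placements = []
    for label, samples in sounds:
        pan = rng.uniform(-1.0, 1.0)              # random stereo position
        level_db = rng.uniform(*level_range_db)   # random loudness level
        gain = 10.0 ** (level_db / 20.0)          # dB to linear amplitude
        g_left, g_right = pan_gains(pan)
        for i, s in enumerate(samples[:length]):
            left[i] += gain * g_left * s
            right[i] += gain * g_right * s
        placements.append({"label": label, "pan": pan, "level_db": level_db})
    return left, right, placements


# Two synthetic tones stand in for isolated sound events (labels are made up).
rng = random.Random(0)
sounds = [
    ("siren", [math.sin(0.05 * n) for n in range(1000)]),
    ("dog_bark", [math.sin(0.20 * n) for n in range(500)]),
]
left, right, placements = mix_soundscape(sounds, 1000, rng)
```

The returned `placements` list records the label, pan, and level drawn for each sound, which is the kind of per-event annotation a polyphonic detection dataset needs alongside the mixed audio.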