This paper introduces SINGA:PURA, a strongly labelled polyphonic urban sound dataset with spatiotemporal context. The data were collected via several recording units deployed across Singapore as part of a wireless acoustic sensor network. The recordings were made for a project to identify and mitigate noise sources in Singapore, but they are also more widely applicable to sound event detection, classification, and localization. We also introduce an accompanying hierarchical label taxonomy, designed to be compatible with other existing datasets for urban sound tagging while also capturing sound events unique to the Singaporean context. We detail the data collection, annotation, and processing methodologies used to create the dataset. We further perform exploratory data analysis and report the performance of a baseline model on the dataset as a benchmark.