Noise is one of the primary quality-of-life issues in urban environments. In addition to causing annoyance, noise negatively impacts public health and educational performance. While low-cost sensors can be deployed to monitor ambient noise levels at high temporal resolutions, the volume and complexity of the data they produce pose significant analytical challenges. One way to address these challenges is through machine listening techniques, which extract features in order to classify noise sources and understand the temporal patterns of a city's noise. However, the overwhelming number of noise sources in the urban environment and the scarcity of labeled data make it nearly impossible to create classification models with vocabularies large enough to capture the true dynamism of urban soundscapes. In this paper, we first identify a set of requirements in the as-yet-unexplored domain of urban soundscape exploration. To satisfy these requirements and tackle the identified challenges, we propose Urban Rhapsody, a framework that combines state-of-the-art audio representation, machine learning, and visual analytics to allow users to interactively create classification models, understand the noise patterns of a city, and quickly retrieve and label audio excerpts in order to create a large, high-precision annotated database of urban sound recordings. We demonstrate the tool's utility through case studies performed by domain experts using data generated over the five-year deployment of a one-of-a-kind sensor network in New York City.