Sound scene geotagging is a new research topic that has evolved from acoustic scene classification and is motivated by audio surveillance. Beyond describing the scene in a recording, a machine that can determine where the recording was captured would be useful to many. In this paper we explore a series of common audio data augmentation methods to evaluate which best improves the accuracy of audio geotagging classifiers. Our work improves on the state-of-the-art city geotagging method by 23% in terms of classification accuracy.