We present the Sound Localization and Classification (SLoClas) corpus, a new database for studying and analyzing sound localization and classification. The corpus contains 23.27 hours of data recorded with a 4-channel microphone array. Ten classes of sounds are played over a loudspeaker placed 1.5 meters from the array, with the Direction-of-Arrival (DoA) varied from 1 degree to 360 degrees at intervals of 5 degrees. To facilitate the study of noise robustness, 6 types of outdoor noise are recorded at 4 DoAs using the same devices. We also propose a baseline method, the Sound Localization and Classification Network (SLCnet), and present experimental results and analysis on the collected SLoClas database. The baseline achieves accuracies of 95.21% and 80.01% for sound localization and classification, respectively. We publicly release the database and the source code for research purposes.
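As a rough illustration of the DoA sampling described above, the following sketch enumerates the angle grid exactly as the abstract words it (starting at 1 degree, stepping by 5 degrees, not exceeding 360 degrees) and maps an arbitrary angle to the nearest grid entry. The function names `doa_grid` and `nearest_doa_index` are hypothetical and are not taken from the released source code; the grid in the released corpus may use different endpoints.

```python
def doa_grid(start_deg=1, stop_deg=360, step_deg=5):
    """Return DoA angles in degrees from start_deg up to stop_deg inclusive.

    Assumption: the grid follows the abstract's wording literally
    (1 degree to 360 degrees at 5-degree intervals).
    """
    return list(range(start_deg, stop_deg + 1, step_deg))


def nearest_doa_index(angle_deg, grid):
    """Return the index of the grid angle closest to angle_deg on the circle."""
    def circular_distance(a, b):
        # Distance between two angles, accounting for wrap-around at 360.
        d = abs(a - b) % 360
        return min(d, 360 - d)

    return min(range(len(grid)), key=lambda i: circular_distance(angle_deg, grid[i]))


grid = doa_grid()
print(len(grid), grid[:3], grid[-1])  # 72 [1, 6, 11] 356
```

Under this literal reading the grid has 72 directions, so a localization model framed as classification over DoAs would have 72 output classes; the circular distance keeps angles near 360 degrees close to those near 1 degree.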