Polyphonic sound event localization and detection (SELD) has many practical applications in acoustic sensing and monitoring. However, the development of real-time SELD has been limited by the demanding computational requirements of most recent SELD systems. In this work, we introduce SALSA-Lite, a fast and effective feature for polyphonic SELD using microphone array inputs. SALSA-Lite is a lightweight variation of the previously proposed SALSA feature for polyphonic SELD. SALSA, which stands for Spatial Cue-Augmented Log-Spectrogram, consists of multichannel log-spectrograms stacked channelwise with the normalized principal eigenvectors of the spectrotemporally corresponding spatial covariance matrices. In contrast to SALSA, which uses eigenvector-based spatial features, SALSA-Lite uses normalized inter-channel phase differences as spatial features, which makes it 30 times faster to compute than the original SALSA feature. Experimental results on the TAU-NIGENS Spatial Sound Events 2021 dataset show that the SALSA-Lite feature achieves performance competitive with the full SALSA feature and significantly outperforms the traditional feature set of multichannel log-mel spectrograms with generalized cross-correlation spectra. Specifically, compared to that traditional feature set, SALSA-Lite increases the localization-dependent F1 score and the class-dependent localization recall by 15% and 5%, respectively.
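To make the structure of the feature concrete, the following is a minimal sketch of how a SALSA-Lite-style feature could be assembled for a single time frame: per-channel log-power spectra, plus inter-channel phase differences taken against a reference channel. The abstract does not give the normalization, so the scaling used here (dividing the phase difference by 2πf/c) is an assumption, as are the function name `salsa_lite_frame`, the speed-of-sound constant, and the choice of the first channel as reference. It is written in plain Python for readability; a practical implementation would vectorize this over whole spectrograms with an array library.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the abstract


def salsa_lite_frame(stft_bins, freqs_hz, eps=1e-8):
    """Sketch of a SALSA-Lite-style feature for one time frame.

    stft_bins: list of M channels, each a list of F complex STFT values.
    freqs_hz:  list of F bin centre frequencies in Hz (all must be > 0).
    Returns (log_spec, nipd): M rows of log-power values and M - 1 rows
    of normalized inter-channel phase differences (NIPD).
    """
    ref = stft_bins[0]  # assumption: first microphone is the reference

    # Spectral part: log-power of every channel.
    log_spec = [[math.log(abs(x) ** 2 + eps) for x in ch] for ch in stft_bins]

    # Spatial part: phase of conj(reference) * channel, scaled by
    # -c / (2 * pi * f) so the value no longer grows with frequency
    # (assumed normalization).
    nipd = []
    for ch in stft_bins[1:]:
        row = []
        for x_ref, x, f in zip(ref, ch, freqs_hz):
            phase_diff = cmath.phase(x_ref.conjugate() * x)
            row.append(-SPEED_OF_SOUND * phase_diff / (2.0 * math.pi * f))
        nipd.append(row)
    return log_spec, nipd


# Usage: a second microphone hears the same signal 0.1 ms later.
tau = 1e-4
freqs = [500.0, 1000.0, 2000.0]
mic1 = [1 + 0j, 0.5 + 0.5j, 2j]
mic2 = [x * cmath.exp(-2j * math.pi * f * tau) for x, f in zip(mic1, freqs)]
log_spec, nipd = salsa_lite_frame([mic1, mic2], freqs)
print(nipd[0])
```

In this toy case every entry of `nipd[0]` is approximately 0.0343, which is the delay multiplied by the assumed speed of sound, that is, the path-length difference in metres. Under this normalization the spatial cue is the same at all three frequencies as long as the raw phase difference does not wrap beyond ±π. Computing it needs only one complex multiplication and one phase extraction per time-frequency bin, whereas the eigenvector-based spatial features of SALSA require an eigendecomposition of a spatial covariance matrix.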