Sounds recorded with smartphones or IoT devices are often partially unreliable: some observations are degraded by clipping or wind noise, and others are missing entirely because of microphone failure or packet loss during network transmission. In this paper, we investigate in detail how partially missing channels affect the performance of acoustic scene classification using multichannel audio recordings, especially those from a distributed microphone array. Missing observations cause not only a loss of time-frequency and spatial information on sound sources but also a mismatch between the trained model and the evaluation data. We also propose simple data augmentation methods for scene classification using multichannel observations with partially missing channels, and we evaluate the classification performance obtained with these methods.
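To make the idea of training on partially missing channels concrete, the following is a minimal sketch of one possible augmentation, not necessarily the one proposed in the paper. It assumes that a missing channel can be simulated by replacing the whole channel with zeros; the function name `mask_missing_channels`, its parameters, and the toy recording are hypothetical and introduced only for illustration.

```python
import random
from typing import List, Optional, Sequence

Channel = List[float]


def mask_missing_channels(
    recording: Sequence[Sequence[float]],
    n_missing: int,
    rng: Optional[random.Random] = None,
) -> List[Channel]:
    """Return a copy of `recording` with `n_missing` random channels zeroed.

    Assumption: a "missing" channel is modelled as all-zero samples. The
    abstract does not say how missing channels are represented.
    """
    if not 0 <= n_missing <= len(recording):
        raise ValueError("n_missing must be between 0 and the number of channels")
    rng = rng or random.Random()
    # Pick which channel indices to treat as missing.
    dropped = set(rng.sample(range(len(recording)), n_missing))
    # Copy every channel so the input recording is left unchanged.
    return [
        [0.0] * len(channel) if index in dropped else list(channel)
        for index, channel in enumerate(recording)
    ]


# Toy 4-channel recording with 3 samples per channel (illustrative values only).
recording = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9], [1.0, 1.1, 1.2]]
augmented = mask_missing_channels(recording, n_missing=2, rng=random.Random(0))
```

Applying such a mask to training examples with a randomly chosen number of dropped channels would expose the model to the kind of train/evaluation mismatch described above; whether zeroing is the most suitable stand-in for a missing channel is an open design choice here.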