Classical methods for acoustic scene mapping require estimating the time difference of arrival (TDOA) between microphones. Unfortunately, TDOA estimation is very sensitive to reverberation and additive noise. We introduce an unsupervised data-driven approach that exploits the natural structure of the data. Our method builds upon local conformal autoencoders (LOCA), an offline deep learning scheme for learning standardized data coordinates from measurements. Our experimental setup includes a microphone array that measures the transmitted sound source at multiple locations across the acoustic enclosure. We demonstrate that LOCA learns a representation that is isometric to the spatial locations of the microphones. The performance of our method is evaluated using a series of realistic simulations and compared with other dimensionality-reduction schemes. We further assess the influence of reverberation on the results of LOCA and show that the method is considerably robust to it.