Effect of acoustic scene complexity and visual scene representation on auditory perception in virtual audio-visual environments

In daily life, social interaction and acoustic communication often take place in complex acoustic environments (CAEs) with a variety of interfering sounds and reverberation. For hearing research and the evaluation of hearing systems, CAEs simulated with virtual reality techniques have gained interest in the context of ecological validity. The current study investigated the effect of scene complexity and visual representation of the scene on psychoacoustic measures such as sound source location, distance perception, loudness, speech intelligibility, and listening effort in a virtual audio-visual environment. A 3-dimensional, 86-channel loudspeaker array was used to render the sound field, either with or without a head-mounted display (HMD) that provided an immersive stereoscopic visual representation of the scene. The scene consisted of a ring of eight (virtual) loudspeakers, which played a target speech stimulus and nonsense speech interferers in several spatial conditions. Either an anechoic environment (snowy outdoor scenery) or an echoic environment (loft apartment) with a reverberation time (T60) of about 1.5 s was simulated. In addition to varying the number of interferers, scene complexity was varied by assessing the psychoacoustic measures either in isolated consecutive measurements or simultaneously. Results showed no significant effect of wearing the HMD on the data. Loudness and distance perception showed significantly different results when they were measured simultaneously rather than consecutively in isolation. The advantage of the suggested setup is that it can be directly transferred to a corresponding real room, enabling a 1:1 comparison and verification of the perception experiments in the real and virtual environments.
