Visual-to-auditory sensory substitution devices convert visual information into sound and can provide valuable assistance for blind people. Recent iterations of these devices rely on depth sensors. Rules for converting depth into sound (i.e., the sonifications) are often designed arbitrarily, with no strong evidence for choosing one over another. The purpose of this work is to compare and understand the effectiveness of five depth sonifications in order to assist the design of future visual-to-auditory systems for blind people that rely on depth sensors. We study five sound parameters: the frequency, amplitude, and reverberation of the sound, the repetition rate of short high-pitched beeps, and the signal-to-noise ratio of a mixture of pure sound and noise. We conducted positioning experiments with twenty-eight sighted, blindfolded participants. Stage 1 consists of learning phases followed by depth estimation tasks. Stage 2 adds the challenge of azimuth estimation to the protocol of stage 1. Stage 3 tests learning retention by inserting a 10-minute break before re-testing depth estimation. In stage 1, the best depth estimates were obtained with the sound frequency and the beep repetition rate. In stage 2, the beep repetition rate yielded the best depth estimates, and no significant difference between sonifications was observed for azimuth estimation. The results of stage 3 showed that the beep repetition rate was the easiest sonification to memorize. Based on a statistical analysis of the results, we discuss the effectiveness of each sonification and compare our findings with other studies that encode depth into sound. Finally, we provide recommendations for the design of depth encoding.
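As an illustration of what a depth sonification rule can look like, the sketch below maps a depth value to a sound frequency and to a beep repetition rate by clamped linear interpolation. It is not the mapping used in the study: the depth range, the frequency and rate ranges, the choice that nearer objects sound higher and beep faster, and all function names are assumptions made here for illustration only.

```python
# Hypothetical working range of the depth sensor, in metres (assumed values).
NEAR_M = 0.5
FAR_M = 5.0


def lerp_depth(depth_m, near_m, far_m, near_value, far_value):
    """Linearly map a depth to a sound parameter, clamping depth to [near_m, far_m]."""
    clamped = min(max(depth_m, near_m), far_m)
    t = (clamped - near_m) / (far_m - near_m)  # 0.0 at near_m, 1.0 at far_m
    return near_value + t * (far_value - near_value)


def depth_to_frequency_hz(depth_m):
    """Assumed rule: nearer objects produce a higher pitch (2000 Hz down to 200 Hz)."""
    return lerp_depth(depth_m, NEAR_M, FAR_M, 2000.0, 200.0)


def depth_to_beep_rate_hz(depth_m):
    """Assumed rule: nearer objects produce faster beeps (10 down to 1 beep per second)."""
    return lerp_depth(depth_m, NEAR_M, FAR_M, 10.0, 1.0)


if __name__ == "__main__":
    for depth in (0.5, 2.75, 5.0):
        print(depth, depth_to_frequency_hz(depth), depth_to_beep_rate_hz(depth))
```

A linear mapping is only one option. Because pitch and loudness are perceived roughly logarithmically, a logarithmic mapping may be a reasonable alternative, and choosing between such rules is the kind of design decision the comparison above is meant to inform.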