In everyday life, the speech recognition performance of human listeners is influenced by diverse factors, such as the acoustic environment, the talker and listener positions, possibly impaired hearing, and optional hearing devices. Prediction models are coming closer to considering all required factors simultaneously when predicting individual speech recognition performance in complex acoustic environments. While such predictions may still not be sufficiently accurate for serious applications, they can already be performed and demand an accessible representation. In this contribution, an interactive representation of speech recognition performance is proposed, which focuses on the listener's head orientation and the spatial dimensions of an acoustic scene. An exemplary modeling toolchain, including an acoustic rendering model, a hearing device model, and a listener model, was used to generate a data set for demonstration purposes. Exploring this data set with the proposed spatial speech recognition maps demonstrated the suitability of the approach for observing potentially relevant behavior. The proposed representation provides a suitable target for comparing and validating different modeling approaches in ecologically relevant contexts. Eventually, it may serve as a tool to apply validated prediction models in the design of spaces and devices that take speech communication into account.
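To make the structure of such a toolchain concrete, the following is a minimal, purely illustrative sketch of how the three stages (acoustic rendering, hearing device, listener model) could be chained over a grid of listener positions and head orientations to produce a spatial speech recognition map. All function names (render_acoustics, apply_hearing_device, predict_srt) and the toy formulas inside them are hypothetical placeholders, not the models used in this work.

```python
import numpy as np

# Hypothetical stand-ins for the three stages of the modeling toolchain;
# real acoustic rendering, hearing device, and listener models would be
# substituted here.

def render_acoustics(scene, listener_pos, head_deg):
    """Acoustic rendering stage (toy): SNR in dB at the listener for a
    scene with one target talker and one noise source."""
    target = np.asarray(scene["target"])
    noise = np.asarray(scene["noise"])
    pos = np.asarray(listener_pos)
    # Toy assumption: SNR degrades as the head turns away from the target.
    bearing = np.degrees(np.arctan2(*(target - pos)[::-1]))
    misalignment = np.abs((bearing - head_deg + 180) % 360 - 180)
    dist_ratio = np.linalg.norm(noise - pos) / np.linalg.norm(target - pos)
    return 20 * np.log10(dist_ratio) - 0.05 * misalignment

def apply_hearing_device(snr_db, gain_db=3.0):
    """Hearing device stage (toy): fixed directional SNR benefit."""
    return snr_db + gain_db

def predict_srt(effective_snr_db, srt_offset_db=-7.0):
    """Listener stage (toy): speech recognition threshold prediction."""
    return srt_offset_db - effective_snr_db

# Sweep listener positions and head orientations to build a spatial map;
# for each position, keep the SRT for the best head orientation.
scene = {"target": (2.5, 0.0), "noise": (-3.0, 1.5)}
xs, ys = np.linspace(-4, 4, 9), np.linspace(-4, 4, 9)
orientations = range(0, 360, 45)

srt_map = np.array([
    [min(predict_srt(apply_hearing_device(
        render_acoustics(scene, (x, y), deg)))
        for deg in orientations)
     for x in xs]
    for y in ys
])
print(srt_map.round(1))
```

The point of this sketch is only the composition: each stage is a swappable component, which is what would allow different acoustic, device, or listener models to be compared and validated against the same spatial representation.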