Over the past few decades, computational methods have been developed to estimate perceptual audio quality. These methods, also referred to as objective quality measures, are usually developed and intended for a specific application domain. Because of their convenience, they are often used outside their originally intended domain, even when it is unclear whether they provide reliable quality estimates there. This work studies the correlation of well-known state-of-the-art objective measures with human perceptual scores in two different domains: audio coding and source separation. The following objective measures are considered: fwSNRseg, dLLR, PESQ, PEAQ, POLQA, PEMO-Q, ViSQOLAudio, (SI-)BSSEval, PEASS, LKR-PI, 2f-model, and HAAQI. Additionally, a novel measure (SI-SA2f) is presented, based on the 2f-model and a BSSEval-based signal decomposition. We use perceptual scores from 7 listening tests on audio coding and 7 listening tests on source separation as ground-truth data for the correlation analysis. The results show that one method (2f-model) performs significantly better than the others in both domains, and they indicate that the training dataset and a robust underlying auditory model are crucial factors for a universal, domain-independent objective measure.