This work introduces a feature extracted from stereophonic/binaural audio signals that serves as a measure of perceived quality degradation in processed spatial auditory scenes. The feature extraction technique is based on a simplified stereo signal model in which auditory events are positioned towards a given direction in the stereo field using amplitude panning (AP) techniques. We decompose the stereo signal in the Short-Time Fourier Transform domain into a set of directional signals, one for each given AP value, and calculate their overall loudness to form directional loudness maps. We then compare the directional loudness maps of a reference signal and of a deteriorated version to derive a distortion measure intended to describe the associated perceived degradation scores reported in listening tests. The measure is tested on an extensive listening test database of stereo signals processed by state-of-the-art perceptual audio codecs that use non-waveform-preserving techniques such as bandwidth extension and joint stereo coding, which are known to pose a challenge to existing quality predictors. Results suggest that the derived distortion measure can be incorporated as an extension to existing automated perceptual quality assessment algorithms to improve prediction on spatially coded audio signals.
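The pipeline described above (decomposition by panning direction in the STFT domain, a loudness value per direction, and a comparison of reference and degraded maps) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the level-ratio panning estimate, the Gaussian weighting around each direction, the use of summed spectral energy in place of a perceptual loudness model, the mean absolute difference as the distortion value, and all function and parameter names.

```python
import cmath
import math


def dft(frame):
    """Naive DFT returning bins 0..N/2 (adequate for small illustrative frames)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def stft(signal, frame_len=64, hop=32):
    """Hann-windowed short-time spectra, one list of complex bins per frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_len)
              for t in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append(dft([signal[start + t] * window[t]
                           for t in range(frame_len)]))
    return frames


def directional_map(left, right, directions, width=0.1, eps=1e-12):
    """Return dl_map[frame][direction].

    Assumption: each time-frequency bin gets a panning estimate from the
    channel level ratio (0 = left, 0.5 = centre, 1 = right), and its energy
    is shared among the candidate directions with a Gaussian weight.
    Summed energy is a crude stand-in for a real loudness model.
    """
    dl_map = []
    for bins_l, bins_r in zip(stft(left), stft(right)):
        row = [0.0] * len(directions)
        for xl, xr in zip(bins_l, bins_r):
            mag_l, mag_r = abs(xl), abs(xr)
            pan = mag_r / (mag_l + mag_r + eps)
            energy = mag_l ** 2 + mag_r ** 2
            for i, d in enumerate(directions):
                row[i] += math.exp(-0.5 * ((pan - d) / width) ** 2) * energy
        dl_map.append(row)
    return dl_map


def map_distortion(ref_map, test_map):
    """Mean absolute difference between two maps (illustrative choice)."""
    diffs = [abs(a - b)
             for ref_row, test_row in zip(ref_map, test_map)
             for a, b in zip(ref_row, test_row)]
    return sum(diffs) / len(diffs)


# Usage: a tone panned towards the left versus the same tone collapsed to centre.
tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(256)]
directions = [0.0, 0.25, 0.5, 0.75, 1.0]
ref_map = directional_map([0.9 * s for s in tone], [0.1 * s for s in tone], directions)
deg_map = directional_map([0.5 * s for s in tone], [0.5 * s for s in tone], directions)
print(map_distortion(ref_map, deg_map))
```

In this toy case the reference map concentrates its energy near the left-most direction while the degraded map concentrates it at the centre, so the distortion value is positive even though both signals carry the same tone. A change in spatial image of this kind is what a directional comparison is meant to expose.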