In audio processing applications, phase retrieval (PR) is often performed from the magnitude of short-time Fourier transform (STFT) coefficients. Although PR performance has been observed to depend on the STFT parameters and the audio data, the extent of this dependence has not yet been systematically evaluated. To address this, we studied the performance of three PR algorithms for various types of audio content and various STFT parameters such as redundancy, time-frequency ratio, and window type. PR quality was assessed in terms of the objective difference grade and the signal-to-noise ratio of the STFT magnitude, providing auditory- and signal-based quality measures, respectively. Our results show that PR quality improved with increasing redundancy and depended strongly on the time-frequency ratio. The effect of the audio content was smaller but still observable. The effect of the window was significant for only one of the PR algorithms. Interestingly, each of the three algorithms required a different set of parameters to achieve good PR quality, which shows that a fair comparison across PR algorithms requires individually chosen parameter sets. Based on these results, we developed guidelines for optimizing STFT parameters for a given application.
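To make two of the quantities above concrete, the following is a minimal sketch rather than the evaluation code used in the study. It assumes that redundancy is the number of frequency channels divided by the hop size, and that the signal-to-noise ratio of the STFT magnitude is the ratio, in dB, of the norm of the reference magnitude to the norm of the magnitude error. The function names (`stft_magnitude`, `redundancy`, `magnitude_snr_db`), the Hann window, and the default parameter values are illustrative choices, not taken from the abstract.

```python
import numpy as np


def stft_magnitude(x, win_len=512, hop=128, n_fft=512):
    """Magnitude of a simple Hann-windowed STFT, shape (frames, bins).

    Illustrative implementation: no padding at the signal borders,
    so trailing samples that do not fill a frame are ignored.
    """
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + win_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=1))


def redundancy(n_fft, hop):
    """Assumed definition: frequency channels per hop in samples."""
    return n_fft / hop


def magnitude_snr_db(mag_ref, mag_est):
    """Assumed definition: 20*log10(||ref|| / ||ref - est||) in dB."""
    err = np.linalg.norm(mag_ref - mag_est)
    return 20.0 * np.log10(np.linalg.norm(mag_ref) / err)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fs = 16000
    t = np.arange(fs) / fs
    x = np.sin(2 * np.pi * 440.0 * t)            # reference signal
    x_hat = x + 0.01 * rng.standard_normal(fs)   # stand-in for a PR output

    s_ref = stft_magnitude(x)
    s_est = stft_magnitude(x_hat)
    print("redundancy:", redundancy(512, 128))
    print("magnitude SNR (dB):", magnitude_snr_db(s_ref, s_est))
```

In an actual evaluation, `x_hat` would be the signal synthesized from the retrieved phase, and the same STFT parameters would be used for both magnitudes so that the comparison stays on one time-frequency grid.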