Phase retrieval arises not only in speech and audio processing but also in many other fields, such as optics. Iterative algorithms based on non-convex set projections are effective and frequently used to retrieve the phase when only short-time Fourier transform (STFT) magnitudes are available. While the basic Griffin-Lim algorithm and its variants have been the prevalent method for decades, more recent advances, e.g., in optics, raise the question: can we do better than Griffin-Lim for speech signals using the same principle of iterative projection? In this paper, we compare the classical algorithms from the speech domain with two modern methods from optics with respect to reconstruction quality and convergence rate. Based on this study, we propose combining Griffin-Lim with the Difference Map algorithm in a hybrid approach, which shows superior results in terms of both convergence and quality of the final reconstruction.
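To make the principle of iterative projection concrete, the following is a minimal NumPy sketch of the basic Griffin-Lim iteration only. It is not the hybrid method proposed in the paper. The helper names (`stft`, `istft`, `griffin_lim`), the Hann window, the frame length and hop size, the random phase initialisation, and the two-tone test signal are all assumptions made for illustration.

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    """Hann-windowed STFT; returns an array of shape (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(S, n_fft=256, hop=64):
    """Least-squares inverse STFT via weighted overlap-add."""
    window = np.hanning(n_fft)
    n_frames = S.shape[0]
    length = n_fft + hop * (n_frames - 1)
    x = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(S, n=n_fft, axis=1)
    for i in range(n_frames):
        x[i * hop:i * hop + n_fft] += frames[i] * window
        norm[i * hop:i * hop + n_fft] += window ** 2
    # Guard against division by zero where the window sum vanishes.
    return x / np.maximum(norm, 1e-8)

def griffin_lim(mag, n_iter=50, n_fft=256, hop=64, seed=0):
    """Alternate between the two projections, starting from random phase."""
    rng = np.random.default_rng(seed)
    S = mag * np.exp(2j * np.pi * rng.random(mag.shape))
    errors = []
    for _ in range(n_iter):
        # Projection onto the set of consistent spectrograms
        # (those that are the STFT of some time-domain signal).
        C = stft(istft(S, n_fft, hop), n_fft, hop)
        # Relative mismatch between the consistent magnitudes and the target.
        errors.append(np.linalg.norm(np.abs(C) - mag) / np.linalg.norm(mag))
        # Projection onto the set of spectrograms with the given magnitudes.
        S = mag * np.exp(1j * np.angle(C))
    return istft(S, n_fft, hop), errors

# Illustrative test signal: two sinusoids sampled at an assumed 8 kHz.
t = np.arange(4096) / 8000.0
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)
mag = np.abs(stft(x))            # only the magnitudes are kept
x_rec, errors = griffin_lim(mag)
```

The `errors` list tracks how far the magnitudes of the current consistent spectrogram are from the target magnitudes, which is one simple way to observe the convergence behaviour that the comparison in the paper is concerned with.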