Diffusion probabilistic models have recently been used in a variety of tasks, including speech enhancement and synthesis. As a generative approach, diffusion models have been shown to be especially suitable for imputation problems, where missing data is generated based on existing data. Phase retrieval is inherently an imputation problem, in which phase information has to be generated from the given magnitude. In this work, we build upon previous work in the speech domain, adapting a speech enhancement diffusion model specifically for STFT phase retrieval. Evaluation using speech quality and intelligibility metrics shows that the diffusion approach is well suited to the phase retrieval task, with performance surpassing both classical and modern methods.
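To make the imputation framing concrete, the following toy sketch shows why the magnitude alone does not determine the signal. It is not the diffusion model described above: it assumes a single 16-sample frame and a plain DFT rather than a full STFT, and the helper names (`dft`, `idft`, `max_abs_error`) are hypothetical, chosen only for illustration.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one real-valued frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT, keeping only the real part of the result."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def max_abs_error(a, b):
    """Largest sample-wise absolute difference between two frames."""
    return max(abs(x - y) for x, y in zip(a, b))


# Toy frame (an assumption for illustration): a sinusoid with a phase offset.
N = 16
frame = [math.sin(2 * math.pi * 2 * t / N + 0.7) for t in range(N)]

spectrum = dft(frame)
magnitude = [abs(c) for c in spectrum]        # what phase retrieval is given
phase = [cmath.phase(c) for c in spectrum]    # what phase retrieval must generate

# Magnitude combined with the true phase recovers the frame.
with_true_phase = idft([m * cmath.exp(1j * p) for m, p in zip(magnitude, phase)])

# Magnitude combined with a naive zero phase yields a different signal.
with_zero_phase = idft([complex(m, 0.0) for m in magnitude])

err_true = max_abs_error(frame, with_true_phase)
err_zero = max_abs_error(frame, with_zero_phase)
print(f"error with true phase: {err_true:.2e}")
print(f"error with zero phase: {err_zero:.2e}")
```

The reconstruction with the true phase matches the original frame up to rounding error, while the zero-phase reconstruction has the same magnitude spectrum but a visibly different waveform. A phase retrieval method, whether classical or diffusion-based, has to supply the phase that this toy example simply discards.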