Differentiable Artificial Reverberation

Artificial reverberation (AR) models play a central role in various audio applications. Therefore, estimating the AR model parameters (ARPs) of a target reverberation is a crucial task. Although a few recent deep-learning-based approaches have shown promising performance, their non-end-to-end training scheme prevents them from fully exploiting the potential of deep neural networks. This motivates the introduction of differentiable artificial reverberation (DAR) models, which allow loss gradients to be back-propagated end-to-end. However, implementing the AR models with their difference equations "as is" in a deep-learning framework severely bottlenecks training speed on a parallel processor such as a GPU, because of their infinite impulse response (IIR) components. We tackle this problem by replacing the IIR filters with finite impulse response (FIR) approximations obtained with the frequency-sampling method (FSM). Using the FSM, we implement three DAR models -- differentiable Filtered Velvet Noise (FVN), Advanced Filtered Velvet Noise (AFVN), and Feedback Delay Network (FDN). For each AR model, we train its ARP estimation networks for the analysis-synthesis (RIR-to-ARP) and blind estimation (reverberant-speech-to-ARP) tasks in an end-to-end manner with its DAR model counterpart. Experimental results show that the proposed method achieves consistent performance improvements over the non-end-to-end approaches in both objective metrics and subjective listening tests.
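
To make the FSM step concrete, the following is a minimal sketch of how an IIR filter can be turned into an FIR approximation by sampling its frequency response on a uniform grid and taking an inverse FFT. It is an illustration under stated assumptions, not the paper's implementation: it uses NumPy instead of a deep-learning framework, the function names `fsm_fir` and `iir_impulse_response` are hypothetical, and the first-order filter coefficients are arbitrary example values.

```python
import numpy as np

def fsm_fir(b, a, n_fft):
    """Length-n_fft FIR approximation of the IIR filter B(z)/A(z).

    Hypothetical helper: samples the frequency response at
    omega_k = 2*pi*k/n_fft and inverts it with an inverse real FFT.
    Assumes n_fft >= max(len(b), len(a)).
    """
    b = np.asarray(b, dtype=float)
    a = np.asarray(a, dtype=float)
    # The zero-padded FFT of the coefficient vectors evaluates B and A
    # on the uniform frequency grid.
    B = np.fft.rfft(b, n_fft)
    A = np.fft.rfft(a, n_fft)
    return np.fft.irfft(B / A, n_fft)

def iir_impulse_response(b, a, n):
    """Reference: the first n impulse-response samples from the recursion."""
    h = np.zeros(n)
    for t in range(n):
        acc = b[t] if t < len(b) else 0.0  # feed-forward part for a unit impulse
        for k in range(1, len(a)):
            if t - k >= 0:
                acc -= a[k] * h[t - k]     # feedback part (sequential dependency)
        h[t] = acc / a[0]
    return h

# Example values (assumed for illustration): a one-pole low-pass filter.
b, a = [0.5], [1.0, -0.5]
h_fsm = fsm_fir(b, a, 64)
h_ref = iir_impulse_response(b, a, 64)
max_err = np.max(np.abs(h_fsm - h_ref))
```

The recursion in `iir_impulse_response` has to run sample by sample, whereas `fsm_fir` consists only of FFTs and an element-wise division, which is the kind of computation that parallelizes well. The FIR obtained this way is a time-aliased version of the true impulse response, so the approximation is accurate only when the response has decayed sufficiently within `n_fft` samples.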
