We propose differentiable artificial reverberation (DAR), a family of artificial reverberation (AR) models implemented in a deep learning framework. Combined with modern deep neural networks (DNNs), the differentiable structure of DAR allows training loss gradients to be back-propagated end to end. Because of their infinite impulse response (IIR) filter components, most AR models bottleneck training speed when implemented "as is" in the time domain and executed on a parallel processor such as a GPU. We tackle this by further developing a recently proposed acceleration technique that borrows the frequency-sampling method (FSM). With the proposed DAR models, we aim to solve the artificial reverberation parameter (ARP) estimation task with a unified approach. We design an ARP estimation network applicable to both analysis-synthesis (RIR-to-ARP) and blind estimation (reverberant-speech-to-ARP) tasks. Switching to a different DAR model requires only a slightly different decoder configuration. In this way, the proposed DAR framework overcomes the task dependency and AR-model dependency that limited previous methods.
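The frequency-sampling idea can be illustrated with a minimal sketch. It assumes a single feedback comb filter, H(z) = 1 / (1 - g z^-M), as the IIR component. The abstract does not specify the filters or the implementation, so the function names and parameter values below are hypothetical. Instead of running the recursion sample by sample, the transfer function is evaluated at N points on the unit circle and transformed back with an inverse DFT, which yields a length-N, time-aliased approximation of the impulse response.

```python
import cmath

def comb_frequency_response(g, M, N):
    """Sample H(z) = 1 / (1 - g z^-M) at N equally spaced points on the unit circle."""
    return [1.0 / (1.0 - g * cmath.exp(-2j * cmath.pi * k * M / N))
            for k in range(N)]

def inverse_dft(X):
    """Naive O(N^2) inverse DFT, kept dependency-free; an FFT would be used in practice."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)).real / N
            for n in range(N)]

def comb_ir_exact(g, M, N):
    """First N samples of the exact impulse response: g^(n/M) at multiples of M, else 0."""
    return [g ** (n // M) if n % M == 0 else 0.0 for n in range(N)]

# Illustrative values (assumptions, not taken from the paper).
g, M, N = 0.5, 4, 128

h_fsm = inverse_dft(comb_frequency_response(g, M, N))  # FIR approximation via frequency sampling
h_ref = comb_ir_exact(g, M, N)                         # reference from the closed form

# The only systematic error is time aliasing of the decaying tail beyond N samples.
max_err = max(abs(a - b) for a, b in zip(h_fsm, h_ref))
```

Each frequency point is computed independently, so the evaluation parallelizes well, whereas the time-domain recursion is inherently sequential. The approximation is accurate only when the impulse response has decayed to a negligible level within N samples; a slowly decaying filter needs a larger N.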