We study magnitude structured pruning as an architecture search to speed up inference in a deep noise suppression (DNS) model. While deep learning approaches have been remarkably successful at enhancing audio quality, their increased complexity inhibits deployment in real-time applications. We achieve up to a 7.25X inference speedup over the baseline, with graceful degradation of model performance. Ablation studies indicate that our proposed network re-parameterization (i.e., the size of each layer) is the major driver of the speedup, and that magnitude structured pruning performs comparably to directly training a model at the smaller size. We report inference speed because a reduction in parameter count does not necessarily yield a speedup, and we measure model quality with an accurate non-intrusive objective speech quality metric.
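To make the pruning operation concrete, the following is a minimal, hypothetical sketch of magnitude structured pruning on a single dense layer's weight matrix. It assumes output channels (rows) are scored by their L1 norm and the lowest-scoring ones are removed entirely; the abstract does not specify the exact scoring criterion, so the L1 norm here is an illustrative choice.

```python
# Illustrative sketch (not the paper's implementation): structured pruning
# removes whole rows (output channels), unlike unstructured pruning, which
# zeroes individual weights and does not by itself shrink the layer.

def prune_rows(weights, keep_ratio):
    """Return sorted indices of the rows to keep, ranked by L1 magnitude.

    `weights` is a list of rows (each a list of floats); `keep_ratio`
    is the assumed fraction of output channels that survive pruning.
    """
    scores = [sum(abs(w) for w in row) for row in weights]
    n_keep = max(1, int(len(weights) * keep_ratio))
    ranked = sorted(range(len(weights)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])

# Usage: a 4x3 weight matrix pruned to half its output channels.
W = [[0.1, -0.2, 0.05],   # small-magnitude row -> pruned
     [1.0,  0.9, -1.1],   # large-magnitude row -> kept
     [0.0,  0.1,  0.0],   # near-zero row       -> pruned
     [-0.8, 0.7,  0.6]]   # large-magnitude row -> kept
print(prune_rows(W, 0.5))  # -> [1, 3]
```

Because entire rows are dropped, the pruned layer is genuinely smaller and faster at inference, which is why structured pruning (rather than unstructured sparsity) can deliver the wall-clock speedups the abstract reports.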