Recent advances in audio declipping have substantially improved the state of the art. Yet practitioners need guidelines to choose a method, and while existing benchmarks have been instrumental in advancing the field, larger-scale experiments are needed to guide such choices. First, we show that the clipping levels in existing small-scale benchmarks are moderate, and we call for benchmarks with more perceptually significant clipping levels. We then propose a general algorithmic framework for declipping that covers existing and new combinations of variants of state-of-the-art techniques exploiting time-frequency sparsity: synthesis vs. analysis sparsity, with plain or structured sparsity. Finally, we systematically compare these combinations and a selection of state-of-the-art methods. Using a large-scale numerical benchmark and a smaller-scale formal listening test, we provide guidelines for various clipping levels, for both speech and various musical genres. The code is made publicly available for the purpose of reproducible research and benchmarking.