Upsampling artifacts are caused by problematic upsampling layers and by the spectral replicas that emerge while upsampling. Depending on the upsampling layer used, such artifacts can be either tonal artifacts (additive high-frequency noise) or filtering artifacts (subtractive, attenuating some bands). In this work, we investigate the practical implications of upsampling artifacts in the resulting audio by studying how different artifacts interact and assessing their impact on the models' performance. To that end, we benchmark a large set of upsampling layers for music source separation: different transposed and subpixel convolution setups, different interpolation upsamplers (including two novel layers based on stretch and sinc interpolation), and different wavelet-based upsamplers (including a novel learnable wavelet layer). Our results show that filtering artifacts, associated with interpolation upsamplers, are perceptually preferable, even if they tend to achieve worse objective scores.
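
As a toy illustration of the spectral replicas and filtering behaviour mentioned above, the following NumPy sketch upsamples a pure tone by a factor of 2 in two ways: zero insertion without any filtering, and nearest-neighbour interpolation (sample repetition). It is not one of the layers benchmarked in this work; the sample rate, tone frequency, and all function names are assumptions chosen only so that the tone and its replica fall on exact DFT bins.

```python
import numpy as np

def zero_stuff(x, factor=2):
    """Upsample by inserting factor-1 zeros between samples (no filtering)."""
    y = np.zeros(len(x) * factor)
    y[::factor] = x
    return y

def nearest_upsample(x, factor=2):
    """Upsample by repeating each sample (nearest-neighbour interpolation)."""
    return np.repeat(x, factor)

def magnitude_at(signal, freq_hz, sample_rate):
    """Magnitude of the DFT bin closest to freq_hz, normalised by length."""
    spectrum = np.abs(np.fft.rfft(signal)) / len(signal)
    bin_index = int(round(freq_hz * len(signal) / sample_rate))
    return spectrum[bin_index]

def replica_ratio(y, base_hz, replica_hz, sample_rate):
    """Replica magnitude relative to the magnitude of the original tone."""
    return (magnitude_at(y, replica_hz, sample_rate)
            / magnitude_at(y, base_hz, sample_rate))

# Hypothetical test signal: a 1 kHz sine sampled at 8 kHz (exactly 100 cycles).
sr_in, n, f0 = 8000, 800, 1000
x = np.sin(2 * np.pi * f0 * np.arange(n) / sr_in)

sr_out = 2 * sr_in          # sample rate after upsampling by 2
f_replica = sr_in - f0      # mirror image of the tone around the old Nyquist

# Zero insertion leaves the replica at full strength.
zero_ratio = replica_ratio(zero_stuff(x), f0, f_replica, sr_out)
# Sample repetition acts as a crude low-pass filter and attenuates it.
hold_ratio = replica_ratio(nearest_upsample(x), f0, f_replica, sr_out)

print(f"replica/tone, zero insertion:   {zero_ratio:.3f}")
print(f"replica/tone, nearest-neighbour: {hold_ratio:.3f}")
```

In this sketch the replica at 7 kHz is as strong as the original tone after zero insertion, while sample repetition reduces it substantially without removing it. The same implicit low-pass filter also attenuates the upper part of the band, which is the subtractive side of the trade-off described in the abstract.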