Most image denoising networks apply a single set of static convolutional kernels across the entire input image. This is sub-optimal for natural images, which often consist of heterogeneous visual patterns. Dynamic convolution attempts to address this issue by using per-pixel convolution kernels, but this greatly increases computational cost. In this work, we present Malleable Convolution (MalleConv), which performs spatially-varying processing with minimal computational overhead. MalleConv uses a smaller set of spatially-varying convolution kernels, a compromise between static and per-pixel convolution kernels. These spatially-varying kernels are produced by an efficient predictor network running on a downsampled input, making them much cheaper to compute than per-pixel kernels predicted from the full-resolution image, while also enlarging the network's receptive field compared with static kernels. The kernels are then jointly upsampled and applied to a full-resolution feature map through an efficient on-the-fly slicing operator with minimal memory overhead. To demonstrate the effectiveness of MalleConv, we use it to build an efficient denoising network we call MalleNet. MalleNet achieves high-quality results without a very deep architecture, making it 8.9x faster than the best-performing denoising algorithms while achieving similar visual quality. We also show that adding a single MalleConv layer to a standard convolution-based backbone can significantly reduce the computational cost, or boost image quality at a similar cost. More information is on our project page: \url{https://yifanjiang.net/MalleConv.html}
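The core idea, predicting a coarse grid of kernels and "slicing" a per-pixel kernel out of it by interpolation, can be sketched as follows. This is a minimal single-channel numpy illustration, not the paper's implementation: the function name, the 3x3 kernel size, and the bilinear slicing details are illustrative assumptions, and the coarse `kernel_grid` stands in for the output of the predictor network run on a downsampled input.

```python
import numpy as np

def malleconv_sketch(feat, kernel_grid):
    """Apply spatially-varying 3x3 kernels to a single-channel feature map.

    feat:        (H, W) feature map.
    kernel_grid: (Gh, Gw, 3, 3) coarse grid of kernels; in the paper these
                 are produced by a small predictor network on a downsampled
                 input (names and sizes here are illustrative).
    """
    H, W = feat.shape
    Gh, Gw = kernel_grid.shape[:2]
    padded = np.pad(feat, 1, mode="edge")
    out = np.empty_like(feat)
    for y in range(H):
        for x in range(W):
            # "Slicing": bilinearly interpolate a per-pixel kernel from the
            # coarse grid on the fly, instead of storing one kernel per pixel.
            gy = y * (Gh - 1) / max(H - 1, 1)
            gx = x * (Gw - 1) / max(W - 1, 1)
            y0, x0 = int(gy), int(gx)
            y1, x1 = min(y0 + 1, Gh - 1), min(x0 + 1, Gw - 1)
            wy, wx = gy - y0, gx - x0
            k = ((1 - wy) * (1 - wx) * kernel_grid[y0, x0]
                 + (1 - wy) * wx * kernel_grid[y0, x1]
                 + wy * (1 - wx) * kernel_grid[y1, x0]
                 + wy * wx * kernel_grid[y1, x1])
            # Depthwise application of the interpolated kernel.
            out[y, x] = np.sum(padded[y:y + 3, x:x + 3] * k)
    return out
```

The memory saving comes from never materializing the full (H, W, 3, 3) kernel tensor: only the small grid is stored, and each per-pixel kernel exists transiently during the convolution.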