Deep neural networks trained end-to-end to map a (noisy) measurement of an image to a clean image perform excellently for a variety of linear inverse problems. Current methods, however, are trained on only a few hundred or a few thousand images, as opposed to the millions of examples deep networks are trained on in other domains. In this work, we study whether major performance gains are expected from scaling up the training set size. We consider image denoising, accelerated magnetic resonance imaging, and super-resolution, and empirically determine the reconstruction quality as a function of training set size, while scaling the network size optimally. For all three tasks we find that an initially steep power-law scaling slows significantly already at moderate training set sizes. Extrapolating those scaling laws suggests that even training on millions of images would not significantly improve performance. To understand the expected behavior, we analytically characterize the performance of a linear estimator learned with early stopped gradient descent. The result formalizes the intuition that once the error induced by learning the signal model is small relative to the error floor, more training examples do not improve performance.
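As a rough illustration of the analytical setting described above (not the paper's code; the dimensions, noise level, and learning rate are illustrative assumptions), the sketch below learns a linear denoiser with early stopped gradient descent on synthetic Gaussian data and reports the test error for increasing training set sizes. The error improves quickly for small training sets and then approaches a floor, mirroring the slowing power-law scaling.

```python
import numpy as np

rng = np.random.default_rng(0)
d, sigma = 32, 0.5  # signal dimension and noise level (illustrative choices)

def sample(n):
    """Synthetic signals x ~ N(0, I) and noisy measurements y = x + sigma * noise."""
    x = rng.standard_normal((n, d))
    y = x + sigma * rng.standard_normal((n, d))
    return x, y

x_val, y_val = sample(500)      # validation set used for early stopping
x_test, y_test = sample(2000)   # held-out test set

def train_linear_denoiser(n, lr=0.05, max_iters=2000):
    """Learn a linear map W minimizing ||X - Y W||^2 with gradient descent,
    keeping the iterate with the lowest validation error (early stopping)."""
    x, y = sample(n)
    W = np.zeros((d, d))
    best_W, best_val = W, np.inf
    for _ in range(max_iters):
        grad = 2 * y.T @ (y @ W - x) / n
        W = W - lr * grad
        val = np.mean((y_val @ W - x_val) ** 2)
        if val < best_val:
            best_W, best_val = W.copy(), val
    return best_W

for n in [10, 100, 1000, 10000]:
    W = train_linear_denoiser(n)
    mse = np.mean((y_test @ W - x_test) ** 2)
    print(f"n = {n:5d}  test MSE = {mse:.4f}")
```

In this toy setup the test error cannot drop below the error of the optimal linear denoiser, so once the estimation error from learning W is small relative to that floor, adding more training examples yields diminishing returns.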