Deep neural networks trained end-to-end to map a noisy measurement of an image to a clean image perform excellently for a variety of linear inverse problems. Current methods, however, are trained on only a few hundred or a few thousand images, as opposed to the millions of examples deep networks are trained on in other domains. In this work, we study whether major performance gains are expected from scaling up the training set size. We consider image denoising, accelerated magnetic resonance imaging, and super-resolution, and empirically determine the reconstruction quality as a function of training set size while simultaneously scaling the network size. For all three tasks we find that an initially steep power-law scaling slows significantly already at moderate training set sizes. Extrapolating these scaling laws suggests that even training on millions of images would not significantly improve performance. To understand this behavior, we analytically characterize the performance of a linear estimator learned with early-stopped gradient descent. The result formalizes the intuition that once the error induced by learning the signal model is small relative to the error floor, more training examples do not improve performance.
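The saturating scaling behavior described above can be illustrated with a small sketch: fitting a power law with an additive error floor, $E(n) = a\,n^{-b} + c$, to reconstruction error measured at several training set sizes $n$, and then extrapolating to larger $n$. The functional form matches the intuition in the abstract; the specific numbers below are illustrative assumptions, not values from the paper.

```python
# Hypothetical sketch: fitting a saturating power law
#   E(n) = a * n^(-b) + c   (c is the error floor)
# to reconstruction error vs. training set size n.
# All numeric values are illustrative, not taken from the paper.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(n, a, b, c):
    """Power-law decay of error with an additive error floor c."""
    return a * n ** (-b) + c

# Synthetic "measurements": error at training set sizes 10^2 .. 10^5,
# generated from the model itself so the fit is exact.
n = np.logspace(2, 5, 8)
errors = scaling_law(n, a=2.0, b=0.4, c=0.05)

# Recover the scale, exponent, and error floor from the measurements.
(a_hat, b_hat, c_hat), _ = curve_fit(scaling_law, n, errors, p0=(1.0, 0.5, 0.0))

# Extrapolate: once a * n^(-b) is small relative to the floor c,
# further increases in n barely change the predicted error.
predicted_error_1e7 = scaling_law(1e7, a_hat, b_hat, c_hat)
print(b_hat, c_hat, predicted_error_1e7)
```

Because the term $a\,n^{-b}$ shrinks toward zero while $c$ stays fixed, the extrapolated error at $10^7$ images sits just above the floor $c$, which is the mechanism the abstract's final sentence formalizes.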