Model-based neural networks provide unparalleled performance for various tasks, such as sparse coding and compressed sensing problems. Due to their strong connection with the sensing model, these networks are interpretable and inherit the prior structure of the problem. In practice, model-based neural networks exhibit higher generalization capability compared to ReLU neural networks. However, this phenomenon has not been addressed theoretically. Here, we leverage complexity measures, including the global and local Rademacher complexities, to provide upper bounds on the generalization and estimation errors of model-based networks. We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks, and derive practical design rules for constructing model-based networks with guaranteed high generalization. Through a series of experiments, we demonstrate that our theoretical insights shed light on several behaviours observed in practice, including the fact that ISTA and ADMM networks exhibit higher generalization abilities than ReLU networks, especially for a small number of training samples.
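To make the object of study concrete, the following is a minimal sketch (not the paper's implementation) of one unfolded ISTA layer: classical ISTA alternates a gradient step on the data-fit term with soft thresholding, and an ISTA network stacks such layers with the matrices `W1`, `W2` and threshold `theta` treated as learnable parameters. All names here are illustrative.

```python
import numpy as np

def soft_threshold(v, theta):
    # Proximal operator of the l1 norm: shrinks entries toward zero
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def ista_layer(x, y, W1, W2, theta):
    # One unfolded ISTA iteration: x <- soft_threshold(W1 @ y + W2 @ x, theta)
    return soft_threshold(W1 @ y + W2 @ x, theta)

# Toy sparse-recovery example: observe y = A @ x_true with x_true sparse
rng = np.random.default_rng(0)
m, n = 10, 20
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[[2, 7]] = [1.0, -0.5]
y = A @ x_true

# Classical (untrained) ISTA weights derived from A; an ISTA network
# would instead learn W1, W2, theta per layer from training data
L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
W1 = A.T / L
W2 = np.eye(n) - A.T @ A / L
x = np.zeros(n)
for _ in range(200):
    x = ista_layer(x, y, W1, W2, theta=0.01 / L)
```

In the unfolded network, each of the (typically few) layers carries its own parameters, so depth plays the role of iteration count; the generalization bounds in the paper apply to this parameterized family.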