This paper provides a theoretical framework for the solutions of feedforward ReLU networks for interpolation, formulated in terms of what we call an interpolation matrix. The framework summarizes, extends, and generalizes our three preceding works, with the expectation that the solutions arising in engineering practice can be included in it and ultimately understood. For three-layer networks, we classify the different kinds of solutions and model them in a normalized form; the finding of solutions is investigated along three dimensions: the data, the network, and the training; and the mechanism of one type of overparameterized solution is interpreted. For deep networks, we present a general result called the sparse-matrix principle, which describes some basic behaviors of deep layers and explains the sparse-activation phenomenon observed in engineering applications related to brain science; an advantage of deep layers over shallower ones is also manifested by this principle. As applications, a general solution of deep neural networks for classification is constructed by that principle, and we use the principle to study the data-disentangling property of encoders. Analogous to the three-layer case, the solutions of deep networks are also explored along several dimensions. Finally, the mechanism of multi-output neural networks is explained from the perspective of interpolation matrices.