This paper provides a theoretical framework for the solutions of feedforward ReLU networks for interpolation, in terms of what we call an interpolation matrix; it summarizes, extends, and generalizes our three preceding works, with the expectation that the solutions arising in engineering practice can be placed in this framework and finally understood. For three-layer networks, we classify the different kinds of solutions and model them in a normalized form; solution finding is investigated along three dimensions: the data, the network, and the training; and the mechanism of overparameterization solutions is interpreted. For deep networks, we present a general result called the sparse-matrix principle, which can describe some basic behavior of deep layers and explain the sparse-activation phenomenon observed in engineering applications and associated with brain science; an advantage of deep layers over shallower ones is also manifested by this principle. As applications, a general solution of deep neural networks for classification is constructed from that principle, and we use the principle to study the data-disentangling property of encoders. Analogously to the three-layer case, the solutions of deep networks are also explored along several dimensions. Finally, the mechanism of multi-output neural networks is explained from the perspective of interpolation matrices.
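As a concrete illustration of the interpolation setting the abstract refers to (a minimal sketch of our own, not a construction taken from the paper; all function names are hypothetical), a one-hidden-layer ReLU network can interpolate one-dimensional data exactly via the classical piecewise-linear construction: each hidden unit contributes a change of slope at one data point.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def build_interpolant(xs, ys):
    """Fit f(x) = y0 + sum_k w_k * ReLU(x - knot_k) exactly through (xs, ys).

    xs must be sorted and distinct. Each ReLU unit places a kink at a data
    point; its weight is the change of slope between adjacent segments.
    """
    slopes = np.diff(ys) / np.diff(xs)       # slope on each segment
    weights = np.diff(slopes, prepend=0.0)   # slope change at each knot
    return xs[0], ys[0], xs[:-1], weights    # knots are all points but the last

def evaluate(params, x):
    x0, y0, knots, weights = params
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # Hidden layer: ReLU(x - knot_k); output layer: weighted sum plus bias y0.
    return y0 + relu(x[:, None] - knots[None, :]) @ weights

xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([1.0, 3.0, 2.0, 5.0])
params = build_interpolant(xs, ys)
print(np.allclose(evaluate(params, xs), ys))  # exact at every data point
```

This network has n - 1 hidden units for n data points; the paper's interpolation-matrix framework concerns how such exact solutions are structured and found in general, including the overparameterized case with many more units than data points.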