We explain how to use Kolmogorov's Superposition Theorem (KST) to overcome the curse of dimensionality when approximating multi-dimensional functions and learning multi-dimensional data sets with two-layer neural networks. That is, there is a class of functions, called $K$-Lipschitz continuous in the sense that the K-outer function $g$ of $f$ is Lipschitz continuous, that can be approximated by a two-layer ReLU network with widths $dn$ and $n$ to achieve an approximation order of $O(d^2/n)$. In addition, we show that polynomials of high degree can be expressed by neural networks with the activation function $\sigma_\ell(t)=(t_+)^\ell$, $\ell\ge 2$, using multiple layers and appropriate widths. The more layers the network has, the higher the degree of the polynomials it can reproduce. Hence, a deep learning algorithm with the high-degree activation function $\sigma_\ell$ can approximate multi-dimensional data well as the number of layers increases. Finally, we present a mathematical justification for image classification by a deep learning algorithm.
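To make the role of the power activation concrete, here is a minimal sketch (not the paper's construction) of $\sigma_\ell(t)=(t_+)^\ell$ in NumPy. It uses the elementary identity $\sigma_2(t)+\sigma_2(-t)=t^2$, which shows how a single width-2 layer with the $\ell=2$ activation already reproduces a degree-2 polynomial exactly; the function name `sigma` is our own choice for illustration.

```python
import numpy as np

def sigma(t, ell=2):
    # Power-of-ReLU activation: sigma_ell(t) = (max(t, 0))**ell
    return np.maximum(t, 0.0) ** ell

# Illustrative identity: sigma_2(t) + sigma_2(-t) = t^2 for all t,
# since exactly one of t and -t has a positive part.
t = np.linspace(-3.0, 3.0, 101)
reconstructed = sigma(t) + sigma(-t)
assert np.allclose(reconstructed, t ** 2)
```

Composing such layers squares (or raises to the $\ell$-th power) the achievable degree at each level, which is the intuition behind deeper networks reproducing higher-degree polynomials.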