We study the approximation of two-layer compositions $f(x) = g(\phi(x))$ by deep networks with ReLU activation, where $\phi$ is a geometrically intuitive, dimensionality-reducing feature map. We focus on two intuitive and practically relevant choices of $\phi$: the orthogonal projection onto a low-dimensional embedded submanifold, and the distance to a collection of low-dimensional sets. We achieve near-optimal approximation rates that depend only on the complexity of the dimensionality-reducing map $\phi$ rather than on the ambient dimension. Since $\phi$ encapsulates all nonlinear features that are material to the function $f$, this suggests that deep networks are faithful to an intrinsic dimension governed by $f$ itself rather than by the complexity of the domain of $f$. In particular, the prevalent assumption of approximating functions on low-dimensional manifolds can be significantly relaxed by using functions of the form $f(x) = g(\phi(x))$ with $\phi$ representing an orthogonal projection onto the same manifold.
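To fix ideas, here is a minimal illustration of the second choice of $\phi$ (an example we supply for concreteness, not a construction drawn from the results above): if the collection of low-dimensional sets consists of the single point $\{0\}$, the feature map reduces to the Euclidean distance to the origin and $f$ becomes a radial function,
\[
\phi(x) \;=\; \operatorname{dist}(x, \{0\}) \;=\; \lVert x \rVert, \qquad f(x) \;=\; g(\phi(x)) \;=\; g(\lVert x \rVert), \qquad x \in \mathbb{R}^D .
\]
In this case the network only needs to realize the one-dimensional distance feature $\lVert x \rVert$ together with the univariate profile $g$, so, in line with the rates described above, the achievable accuracy is governed by the complexity of $\phi$ and $g$ rather than by the ambient dimension $D$.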