We prove several universal approximation results at minimal or near-minimal width for $L^p(\mathbb{R}^{d_x}, \mathbb{R}^{d_y})$ and, on compact sets, $C^0(\mathbb{R}^{d_x}, \mathbb{R}^{d_y})$. Our approach uses a unified coding scheme that yields explicit constructions relying only on standard analytic tools. We show that feedforward neural networks with the two leaky ReLU activations $\sigma_\alpha$, $\sigma_{-\alpha}$ achieve the optimal width $\max\{d_x, d_y\}$ for $L^p$ approximation, while a single leaky ReLU $\sigma_\alpha$ achieves width $\max\{2, d_x, d_y\}$, providing an alternative proof of the results of Cai et al. (2023). By generalizing to stepped leaky ReLU activations, we extend these results to uniform approximation of continuous functions while identifying sets of activation functions compatible with gradient-based training. Since our constructions pass through an intermediate dimension of one, they imply that autoencoders with a one-dimensional feature space are universal approximators. We further show that squashable activations combined with FLOOR achieve width $\max\{3, d_x, d_y\}$ for uniform approximation. We also establish a lower bound of $\max\{d_x, d_y\} + 1$ for networks whose activations are all continuous and monotone, provided $d_y \leq 2d_x$. Moreover, we extend our results to invertible LU-decomposable networks, proving distributional universal approximation for LU-Net normalizing flows and providing a constructive proof of the classical theorem of Brenier and Gangbo on $L^p$ approximation by diffeomorphisms.
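As an illustration of the activations involved, the following sketch (not taken from the paper; function names and the NumPy realization are our own) defines the leaky ReLU $\sigma_\alpha$ and its inverse. Invertibility for $\alpha \neq 0$ is what makes such activations usable in the invertible LU-decomposable networks mentioned above; the pair $\sigma_\alpha$, $\sigma_{-\alpha}$ is obtained by flipping the sign of the slope parameter.

```python
import numpy as np

def leaky_relu(x, alpha):
    # sigma_alpha(x) = x for x >= 0, alpha * x for x < 0
    return np.where(x >= 0, x, alpha * x)

def leaky_relu_inv(y, alpha):
    # For alpha != 0, sigma_alpha is a bijection on R; invert the
    # negative branch by dividing by the slope alpha.
    return np.where(y >= 0, y, y / alpha)

# Round-trip check: the inverse recovers the input exactly.
x = np.linspace(-2.0, 2.0, 9)
assert np.allclose(leaky_relu_inv(leaky_relu(x, 0.3), 0.3), x)
```

Note that for $\alpha < 0$ (as in $\sigma_{-\alpha}$ with $\alpha > 0$) the map is no longer monotone, which is consistent with the lower bound above applying only to continuous *monotone* activations.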