We revisit the mean field parametrization of shallow neural networks, using signed measures on unbounded parameter spaces and duality pairings that take into account the regularity and growth of activation functions. This setting leads directly to the use of unbalanced Kantorovich-Rubinstein norms defined by duality with Lipschitz functions, and of spaces of measures dual to those of continuous functions with controlled growth. These spaces make transparent the need for total variation and moment bounds, or for penalization, to obtain existence of minimizers of variational formulations; under such bounds we prove a compactness result in the strong Kantorovich-Rubinstein norm, and in their absence we give several examples of undesirable behavior. Further, the Kantorovich-Rubinstein setting allows us to combine the advantages of a completely linear parametrization, and the ensuing reproducing kernel Banach space framework, with insights from optimal transport. We showcase this synergy through representer theorems and uniform large data limits for empirical risk minimization, as well as in proposed formulations for distillation and fusion applications.
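As a minimal sketch of the objects involved (the notation below, with parameter space $\Omega$, feature map $\sigma$, and test functions $\varphi$, is an assumed shorthand based on the standard bounded-Lipschitz duality, not notation fixed by this abstract), the mean field network associated with a signed measure $\mu$ and a Kantorovich-Rubinstein norm obtained by duality with Lipschitz functions may be written as
\[
  f_\mu(x) = \int_{\Omega} \sigma(\omega; x)\, \mathrm{d}\mu(\omega),
  \qquad
  \|\mu\|_{\mathrm{KR}} = \sup\Big\{ \int_{\Omega} \varphi\, \mathrm{d}\mu \;:\; \varphi \in \operatorname{Lip}(\Omega),\ \|\varphi\|_{\infty} \le 1,\ \operatorname{Lip}(\varphi) \le 1 \Big\}.
\]
In this shorthand, total variation and moment bounds on $\mu$ control the size and the spread of the parameter distribution on the unbounded domain $\Omega$.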