When can the input of a ReLU neural network be inferred from its output? In other words, when is the network injective? We consider a single layer, $x \mapsto \mathrm{ReLU}(Wx)$, with a random Gaussian $m \times n$ matrix $W$, in a high-dimensional setting where $n, m \to \infty$. Recent work connects this problem to spherical integral geometry: by studying the expected Euler characteristic of a certain random set, it gives rise to a conjectured sharp injectivity threshold for the aspect ratio $\alpha = \frac{m}{n}$. We adopt a different perspective and show that injectivity is equivalent to a property of the ground state of the spherical perceptron, an important spin glass model in statistical physics. By leveraging the (non-rigorous) theory of replica symmetry breaking, we derive analytical equations for the threshold whose solution is at odds with that from the Euler characteristic. Furthermore, we use Gordon's min--max theorem to prove that a replica-symmetric upper bound refutes the Euler characteristic prediction. Along the way, we give a tutorial-style introduction to key ideas from statistical physics, in an effort to make the exposition accessible to a broad audience. Our analysis establishes a connection between spin glasses and integral geometry, but leaves open the problem of explaining the discrepancies.
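To make the setup concrete, the following is a minimal numerical sketch, not part of the analysis above. It builds the single layer $x \mapsto \mathrm{ReLU}(Wx)$ with Gaussian $W$ and probes injectivity at a few aspect ratios $\alpha = m/n$. The probe assumes the directed-spanning-set characterization of single-layer ReLU injectivity from the prior literature (injectivity requires that, for every direction $x$, the rows of $W$ active at $x$ span $\mathbb{R}^n$); sampling random directions tests this only at typical points, so it yields a heuristic necessary-condition check, not a certificate. All function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu_layer(W, x):
    """Single-layer map x -> ReLU(Wx)."""
    return np.maximum(W @ x, 0.0)

def active_rows_span(W, x):
    """Heuristic check at direction x (assumption: based on the
    directed-spanning-set condition from the injectivity literature):
    the rows of W strictly active at x should span R^n."""
    active = W[(W @ x) > 0]
    n = W.shape[1]
    return active.shape[0] >= n and np.linalg.matrix_rank(active) == n

def injectivity_probe(n, alpha, trials=200):
    """Monte Carlo probe: draw a Gaussian W with m = alpha * n rows,
    sample random unit directions, and report the fraction of
    directions passing the spanning check. A fraction below 1 signals
    non-injectivity; a fraction of 1 is only suggestive, since true
    injectivity requires the condition for *all* directions."""
    m = int(alpha * n)
    W = rng.standard_normal((m, n))
    passes = 0
    for _ in range(trials):
        x = rng.standard_normal(n)
        x /= np.linalg.norm(x)
        passes += active_rows_span(W, x)
    return passes / trials

# Scan a few aspect ratios alpha = m / n at moderate dimension.
for alpha in (2.0, 4.0, 8.0, 12.0):
    print(f"alpha = {alpha}: pass rate = {injectivity_probe(n=100, alpha=alpha)}")
```

At small $\alpha$ the probe fails for many sampled directions, while at larger $\alpha$ it passes everywhere sampled, illustrating qualitatively the threshold phenomenon in $\alpha$ that the abstract discusses; locating the sharp threshold is exactly what the competing Euler characteristic and spin glass analyses disagree on.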