There has been a recent surge of interest in the asymptotic reconstruction performance of generalized linear estimation problems in the teacher-student setting, especially for i.i.d. standard normal matrices. Here, we go beyond these matrices and prove an analytical formula for the reconstruction performance of convex generalized linear models with rotationally invariant data matrices with arbitrary bounded spectrum, rigorously confirming a conjecture originally derived using the replica method from statistical physics. The formula covers many problems, such as compressed sensing and sparse logistic classification. The proof leverages message passing algorithms and the statistical properties of their iterates, which allows us to characterize the asymptotic empirical distribution of the estimator. It relies crucially on the construction of converging sequences of an oracle multi-layer vector approximate message passing algorithm, where convergence is established by checking the stability of an equivalent dynamical system. We illustrate our claim with numerical examples on mainstream learning methods such as sparse logistic regression and linear support vector classifiers, showing excellent agreement between moderate-size simulations and the asymptotic prediction.
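For concreteness, the setting described above can be sketched in formulas. The notation below is an assumption introduced for illustration and is not fixed by the abstract: $x_0$ is the teacher vector, $F$ the data matrix, $\phi$ the teacher's output channel with noise $\omega$, $g$ a convex loss, $f$ a convex regularizer, and $\alpha$ the limiting aspect ratio.

```latex
% Requires amsmath and amssymb. All symbols are assumed notation (see lead-in).
\begin{align}
  % Teacher: labels generated from a ground-truth vector x_0
  y &= \phi(F x_0, \omega), \qquad x_0 \in \mathbb{R}^N,\; F \in \mathbb{R}^{M \times N}, \\
  % Student: convex generalized linear estimator
  \hat{x} &\in \operatorname*{arg\,min}_{x \in \mathbb{R}^N} \; g(Fx, y) + f(x), \\
  % Rotational invariance: Haar-distributed singular vectors, bounded singular values
  F &= U D V^\top, \qquad U \in O(M),\; V \in O(N) \text{ Haar-distributed}, \\
  % High-dimensional limit in which the formula is stated
  M, N &\to \infty, \qquad M/N \to \alpha \in (0, \infty).
\end{align}
% Two instances named in the abstract, written with z = Fx and an
% assumed regularization strength \lambda > 0:
\begin{align}
  \text{compressed sensing (LASSO):} \quad
    & g(z, y) = \tfrac{1}{2}\lVert y - z \rVert_2^2, \quad f(x) = \lambda \lVert x \rVert_1, \\
  \text{sparse logistic classification:} \quad
    & g(z, y) = \sum_{i=1}^{M} \log\bigl(1 + e^{-y_i z_i}\bigr), \quad f(x) = \lambda \lVert x \rVert_1 .
\end{align}
```

In this notation, the i.i.d. standard normal case is one particular rotationally invariant ensemble, and the reconstruction performance is a quantity such as the limiting value of $\lVert \hat{x} - x_0 \rVert_2^2 / N$.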