We consider a statistical model for symmetric matrix factorization with additive Gaussian noise in the high-dimensional regime where the rank $M$ of the signal matrix to infer scales with its size $N$ as $M={\rm o}(\sqrt{\ln N})$. Allowing for an $N$-dependent rank offers new challenges and requires new methods. Working in the Bayes-optimal setting, we show that whenever the signal has i.i.d.~entries, the limiting mutual information between signal and data is given by a variational formula involving a rank-one replica symmetric potential. In other words, from the information-theoretic perspective, the case of a (slowly) growing rank is the same as when $M=1$ (namely, the standard spiked Wigner model). The proof is primarily based on a novel multiscale cavity method allowing for growing rank along with some information-theoretic identities on worst noise for the vector Gaussian channel. We believe that the cavity method developed here will play a role in the analysis of a broader class of inference and spin models where the degrees of freedom are large arrays instead of vectors.
翻译:暂无翻译