Structural matrix-variate observations routinely arise in diverse fields such as multi-layer network analysis and brain image clustering. While data of this type have been extensively investigated with fruitful results, fundamental questions such as statistical optimality and computational limits remain largely under-explored. In this paper, we propose a low-rank Gaussian mixture model (LrMM) that assumes each matrix-valued observation has a planted low-rank structure. We establish minimax lower bounds for estimating the underlying low-rank matrix across a whole range of sample sizes and signal strengths. Under a minimal condition on signal strength, referred to as the information-theoretical limit or statistical limit, we prove the minimax optimality of a maximum likelihood estimator which, in general, is computationally infeasible. If the signal is stronger than a certain threshold, called the computational limit, we design a computationally fast estimator based on spectral aggregation and demonstrate its minimax optimality. Moreover, when the signal strength is smaller than the computational limit, we provide evidence based on the low-degree likelihood ratio framework suggesting that no polynomial-time algorithm can consistently recover the underlying low-rank matrix. Our results reveal multiple phase transitions in the minimax error rates and the statistical-to-computational gap. Numerical experiments confirm our theoretical findings. We further showcase the merit of our spectral aggregation method on a worldwide food trading dataset.
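To make the setting concrete, the following is a minimal simulation sketch and not the estimator analyzed in the paper. It assumes a symmetric two-component model X_i = s_i M + Z_i with hidden labels s_i in {-1, +1}, a rank-r matrix M, and i.i.d. standard Gaussian noise Z_i. The abstract does not spell out any of these details. The function names `simulate_lrmm` and `spectral_sketch`, and the three-step procedure (estimate singular subspaces from aggregated second moments, cluster the projected observations, average with the estimated labels), are illustrative choices made here in the spirit of spectral aggregation.

```python
import numpy as np


def simulate_lrmm(n, d1, d2, r, signal, rng):
    """Draw n matrices X_i = s_i * M + Z_i (assumed two-component model)."""
    U, _ = np.linalg.qr(rng.standard_normal((d1, r)))
    V, _ = np.linalg.qr(rng.standard_normal((d2, r)))
    M = signal * U @ V.T                       # rank r, all singular values = signal
    s = rng.choice([-1.0, 1.0], size=n)        # hidden component labels
    X = s[:, None, None] * M + rng.standard_normal((n, d1, d2))
    return X, M, s


def top_eigvecs(S, r):
    """Eigenvectors of a symmetric matrix for its r largest eigenvalues."""
    _, vecs = np.linalg.eigh(S)                # eigenvalues come back in ascending order
    return vecs[:, -r:]


def spectral_sketch(X, r):
    """Illustrative spectral estimate of M (identifiable only up to sign)."""
    n = X.shape[0]
    # Step 1: aggregate second moments; the label signs cancel in X_i X_i^T.
    left = np.einsum("nij,nkj->ik", X, X)      # sum_i X_i X_i^T
    right = np.einsum("nji,njk->ik", X, X)     # sum_i X_i^T X_i
    U_hat = top_eigvecs(left, r)
    V_hat = top_eigvecs(right, r)
    # Step 2: project each observation to r x r and split along the top direction.
    Y = np.einsum("ia,nij,jb->nab", U_hat, X, V_hat).reshape(n, -1)
    w = top_eigvecs(Y.T @ Y, 1)[:, 0]
    s_hat = np.sign(Y @ w)
    s_hat[s_hat == 0] = 1.0
    # Step 3: sign-corrected average, projected onto the estimated subspaces.
    mean = np.einsum("n,nij->ij", s_hat, X) / n
    M_hat = U_hat @ (U_hat.T @ mean @ V_hat) @ V_hat.T
    return M_hat, s_hat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, M, s = simulate_lrmm(n=200, d1=20, d2=20, r=2, signal=10.0, rng=rng)
    M_hat, s_hat = spectral_sketch(X, r=2)
    err = min(np.linalg.norm(M_hat - M), np.linalg.norm(M_hat + M))
    print("relative error (up to sign):", err / np.linalg.norm(M))
```

Lowering `signal` relative to the noise level is one way to explore empirically how recovery degrades. The sketch makes no claim about where the statistical or computational thresholds described above lie.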