Recovering a planted vector $v$ in an $n$-dimensional random subspace of $\mathbb{R}^N$ is a generic task related to many problems in machine learning and statistics, such as dictionary learning, subspace recovery, principal component analysis, and non-Gaussian component analysis. In this work, we study computationally efficient estimation and detection of a planted vector $v$ whose $\ell_4$ norm differs from that of a Gaussian vector with the same $\ell_2$ norm. For instance, in the special case where $v$ is an $N \rho$-sparse vector with Bernoulli-Gaussian or Bernoulli-Rademacher entries, our results include the following: (1) We give an improved analysis of a slight variant of the spectral method proposed by Hopkins, Schramm, Shi, and Steurer (2016), showing that it approximately recovers $v$ with high probability in the regime $n \rho \ll \sqrt{N}$. This condition subsumes the conditions $\rho \ll 1/\sqrt{n}$ or $n \sqrt{\rho} \lesssim \sqrt{N}$ required by previous work up to polylogarithmic factors. We achieve $\ell_\infty$ error bounds for the spectral estimator via a leave-one-out analysis, from which it follows that a simple thresholding procedure exactly recovers $v$ with Bernoulli-Rademacher entries, even in the dense case $\rho = 1$. (2) We study the associated detection problem and show that in the regime $n \rho \gg \sqrt{N}$, any spectral method from a large class (and more generally, any low-degree polynomial of the input) fails to detect the planted vector. This matches the condition for recovery and offers evidence that no polynomial-time algorithm can succeed in recovering a Bernoulli-Gaussian vector $v$ when $n \rho \gg \sqrt{N}$.
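As an illustration of the setting, the following is a minimal sketch of the planted model and of a spectral estimator in the spirit of the method of Hopkins, Schramm, Shi, and Steurer (2016). It is not the exact variant analyzed in this work. The function names (`planted_basis`, `spectral_estimate`), the use of power iteration, and the parameter values are all assumptions made for the example. The estimator forms the matrix $\sum_i (\|b_i\|^2 - n/N)\, b_i b_i^\top$ from the rows $b_i$ of an orthonormal basis of the subspace and maps its leading eigenvector back to $\mathbb{R}^N$.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(w):
    norm = math.sqrt(dot(w, w))
    return [x / norm for x in w]


def planted_basis(N, n, rho, rng):
    """Return (v, basis): a unit Bernoulli-Rademacher vector v and an
    orthonormal basis (n vectors in R^N) of span{v, g_1, ..., g_{n-1}}."""
    v = normalize([rng.choice((-1.0, 1.0)) if rng.random() < rho else 0.0
                   for _ in range(N)])
    span = [v] + [[rng.gauss(0.0, 1.0) / math.sqrt(N) for _ in range(N)]
                  for _ in range(n - 1)]
    # Hide v: take n random Gaussian combinations of the spanning vectors
    # (almost surely the same subspace), then orthonormalize by Gram-Schmidt.
    basis = []
    for _ in range(n):
        coeffs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        w = [sum(c * s[i] for c, s in zip(coeffs, span)) for i in range(N)]
        for q in basis:
            proj = dot(q, w)
            w = [x - proj * y for x, y in zip(w, q)]
        basis.append(normalize(w))
    return v, basis


def spectral_estimate(basis, rng, iters=200):
    """Estimate the planted vector (up to sign) from an orthonormal basis."""
    n, N = len(basis), len(basis[0])
    # M = sum_i (||b_i||^2 - n/N) b_i b_i^T, where b_i in R^n is the i-th row
    # of the N x n matrix whose columns are the basis vectors.
    M = [[0.0] * n for _ in range(n)]
    for i in range(N):
        b = [q[i] for q in basis]
        weight = dot(b, b) - n / N
        for r in range(n):
            for c in range(n):
                M[r][c] += weight * b[r] * b[c]
    # Power iteration (an illustrative choice): it converges to the eigenvector
    # of largest magnitude, assumed here to be the planted direction.
    u = normalize([rng.gauss(0.0, 1.0) for _ in range(n)])
    for _ in range(iters):
        u = normalize([dot(row, u) for row in M])
    # Map the n-dimensional eigenvector back to R^N.
    return [sum(u[k] * basis[k][i] for k in range(n)) for i in range(N)]


if __name__ == "__main__":
    rng = random.Random(0)
    N, n, rho = 2000, 8, 0.05  # n * rho = 0.4, far below sqrt(N) ~ 44.7
    v, basis = planted_basis(N, n, rho, rng)
    v_hat = spectral_estimate(basis, rng)
    print("||v||_4^4 (planted):", sum(x ** 4 for x in v))
    print("3 / N (typical for a unit Gaussian vector):", 3 / N)
    print("|<v_hat, v>|:", abs(dot(v_hat, v)))
```

The first two printed quantities illustrate the $\ell_4$ gap that the estimator exploits: a unit-norm Bernoulli-Rademacher vector has $\|v\|_4^4 \approx 1/(N\rho)$, whereas a unit-norm Gaussian vector has $\ell_4^4$ norm close to $3/N$. The sketch only targets approximate recovery in $\ell_2$. The thresholding step for exact recovery is not shown.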