In the Wishart model for sparse PCA we are given $n$ samples $Y_1,\ldots, Y_n$ drawn independently from a $d$-dimensional Gaussian distribution $N(0, \mathrm{Id} + \beta vv^\top)$, where $\beta > 0$ and $v\in \mathbb{R}^d$ is a $k$-sparse unit vector, and we wish to recover $v$ (up to sign). We show that if $n \ge \Omega(d)$, then for every $t \ll k$ there exists an algorithm running in time $n\cdot d^{O(t)}$ that solves this problem as long as \[ \beta \gtrsim \frac{k}{\sqrt{nt}}\sqrt{\ln(2 + td/k^2)}\,. \] Prior to this work, the best polynomial-time algorithm in the regime $k\approx \sqrt{d}$, called \emph{Covariance Thresholding} (proposed in [KNV15a] and analyzed in [DM14]), required $\beta \gtrsim \frac{k}{\sqrt{n}}\sqrt{\ln(2 + d/k^2)}$. For a large enough constant $t$, our algorithm runs in polynomial time and has better guarantees than Covariance Thresholding. Previously known algorithms with such guarantees required quasi-polynomial time $d^{O(\log d)}$. In addition, we show that our techniques apply to sparse PCA with adversarial perturbations, studied in [dKNS20]. This model generalizes not only sparse PCA but also other problems studied in prior works, including the sparse planted vector problem. As a consequence, we provide polynomial-time algorithms for the sparse planted vector problem that have better guarantees than the state of the art in some regimes. Our approach also applies to the Wigner model for sparse PCA. Moreover, we show that it is possible to combine our techniques with recent results on sparse PCA with symmetric heavy-tailed noise [dNNS22]. In particular, in the regime $k \approx \sqrt{d}$ we get the first polynomial-time algorithm that works with symmetric heavy-tailed noise, while the algorithm from [dNNS22] requires quasi-polynomial time in this setting.
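To make the model and the two signal-strength conditions concrete, the following is a minimal Python sketch, not the algorithm from the paper. The function names `sample_wishart_spca` and `beta_bound` are hypothetical, the choice of equal-magnitude random-sign entries for $v$ is an illustrative assumption (the abstract only requires a $k$-sparse unit vector), and the hidden constants in $\gtrsim$ are set to 1.

```python
import math
import random


def sample_wishart_spca(n, d, k, beta, seed=0):
    """Draw n samples from N(0, Id + beta * v v^T) for a k-sparse unit vector v.

    Assumption for illustration: v has entries +-1/sqrt(k) on a random support.
    """
    rng = random.Random(seed)
    support = rng.sample(range(d), k)
    v = [0.0] * d
    for j in support:
        v[j] = rng.choice([-1.0, 1.0]) / math.sqrt(k)

    samples = []
    for _ in range(n):
        # Y = g + sqrt(beta) * u * v with g ~ N(0, Id) and u ~ N(0, 1)
        # has covariance Id + beta * v v^T.
        u = rng.gauss(0.0, 1.0)
        y = [rng.gauss(0.0, 1.0) + math.sqrt(beta) * u * v[j] for j in range(d)]
        samples.append(y)
    return samples, v


def beta_bound(n, d, k, t=1):
    """Evaluate k / sqrt(n t) * sqrt(ln(2 + t d / k^2)), constants omitted."""
    return k / math.sqrt(n * t) * math.sqrt(math.log(2 + t * d / k**2))


if __name__ == "__main__":
    n, d, k = 1000, 100, 10  # k is about sqrt(d)
    samples, v = sample_wishart_spca(n, d, k, beta=1.0)
    print("t = 1 (Covariance Thresholding form):", beta_bound(n, d, k, t=1))
    print("t = 4:", beta_bound(n, d, k, t=4))
```

With $t = 1$ the expression coincides with the Covariance Thresholding requirement quoted above. Increasing $t$ lowers the value at the cost of a running time of $n \cdot d^{O(t)}$, and the stated result only covers $t \ll k$.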