Statistical-Computational Tradeoffs in Mixed Sparse Linear Regression

We consider the problem of mixed sparse linear regression with two components, where two real $k$-sparse signals $\beta_1, \beta_2$ are to be recovered from $n$ unlabelled noisy linear measurements. The sparsity is allowed to be sublinear in the dimension, and the additive noise is assumed to be independent Gaussian with variance $\sigma^2$. Prior work has shown that the problem suffers from a $\frac{k}{\mathrm{SNR}^2}$-to-$\frac{k^2}{\mathrm{SNR}^2}$ statistical-to-computational gap, resembling other computationally challenging high-dimensional inference problems such as Sparse PCA and Robust Sparse Mean Estimation; here $\mathrm{SNR}$ is the signal-to-noise ratio. We establish the existence of a more extensive computational barrier for this problem through the method of low-degree polynomials, but show that the problem is computationally hard only in a very narrow symmetric parameter regime. We identify a smooth information-computation tradeoff between the sample complexity $n$ and runtime for any randomized algorithm in this hard regime. Via a simple reduction, this provides novel rigorous evidence for the existence of a computational barrier to solving exact support recovery in sparse phase retrieval with sample complexity $n = \tilde{o}(k^2)$. Our second contribution is to analyze a simple thresholding algorithm which, outside of the narrow regime where the problem is hard, solves the associated mixed regression detection problem in $O(np)$ time with the square root of the number of samples, matching the sample complexity required for (non-mixed) sparse linear regression; this allows the recovery problem to be subsequently solved by state-of-the-art techniques from the dense case. As a special case of our results, we show that this simple algorithm is order-optimal among a large family of algorithms in solving exact signed support recovery in sparse linear regression.
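To make the measurement model and the idea of an $O(np)$ thresholding step concrete, the following is a minimal sketch and not the paper's algorithm. It assumes details the abstract does not state: a standard Gaussian design, a mixing proportion `phi`, signals with $\beta_2 = -\beta_1$ sharing one support, and a hand-picked threshold `tau`. All function names and parameter values are hypothetical.

```python
import random


def simulate(n, p, support, phi, sigma, rng):
    """Draw n unlabelled samples from a two-component mixture.

    Assumption for illustration: beta_1 equals 1 on `support` and 0
    elsewhere, and beta_2 = -beta_1. Each sample uses beta_1 with
    probability phi and beta_2 otherwise; the label is never returned.
    """
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        sign = 1.0 if rng.random() < phi else -1.0  # hidden component label
        signal = sign * sum(x[j] for j in support)
        X.append(x)
        y.append(signal + sigma * rng.gauss(0.0, 1.0))
    return X, y


def correlation_support(X, y, tau):
    """Return the coordinates j with |(1/n) * sum_i y_i * x_ij| > tau.

    A single pass over the n-by-p design, hence O(np) time.
    """
    n, p = len(X), len(X[0])
    corr = [0.0] * p
    for x, yi in zip(X, y):
        for j in range(p):
            corr[j] += yi * x[j]
    return {j for j in range(p) if abs(corr[j]) / n > tau}


rng = random.Random(0)
support = {3, 17, 42, 58, 91}  # hypothetical k = 5 support in dimension p = 100
X, y = simulate(n=3000, p=100, support=support, phi=0.8, sigma=0.5, rng=rng)
estimated = correlation_support(X, y, tau=0.3)
```

Under these assumptions the expected correlation at coordinate $j$ is $(2\phi - 1)\beta_{1,j}$, so the statistic separates the support from the remaining coordinates when `phi` is away from $1/2$. At `phi = 0.5` that expectation is zero for every coordinate and this first-moment statistic carries no information about the support, which illustrates how a symmetric regime can behave differently from the rest of the parameter space.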
