The LASSO is a recent technique for variable selection in the regression model $y = X\beta + z$, where $X\in \mathbb{R}^{n\times p}$ and $z$ is a centered Gaussian noise vector with i.i.d. components, $z \sim \mathcal N(0,\sigma^2 I)$. The LASSO has been proved to achieve remarkable properties, such as exact support recovery of sparse vectors when the columns of $X$ are sufficiently incoherent, and low prediction error under even less stringent conditions. However, in practical applications many design matrices do not have small coherence, and the LASSO estimator may thus suffer from what is known as the slow rate regime. The goal of the present paper is to study the LASSO from a slightly different perspective, by proposing a mixture model for the design matrix that captures in a natural way the potentially clustered nature of the columns in many practical situations. In this model, the columns of the design matrix are drawn from a Gaussian mixture model. Instead of requiring incoherence of the design matrix $X$, we only require incoherence of the much smaller matrix of the mixture's centers. Our main result states that $X\beta$ can be estimated with the same precision as for incoherent designs, except for a correction term depending on the maximal variance in the mixture model.
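
To make the setting concrete, the following is a minimal numerical sketch of a clustered design and a LASSO fit, not the construction analysed in the paper. Everything specific in it is an assumption made for illustration: unit-norm random centers, uniform cluster assignment, isotropic within-cluster variance `s**2 / n`, the penalty level `sigma * sqrt(2 log p)`, and the helper names `sample_mixture_design`, `coherence` and `lasso_ista`. The LASSO is solved here by plain proximal gradient descent (ISTA), chosen only to keep the example dependency-free.

```python
import numpy as np

def sample_mixture_design(n, p, K, s, rng):
    """Draw X (n x p) whose columns come from a K-component Gaussian mixture.

    Illustrative assumptions: unit-norm random centers, uniform cluster
    assignment, isotropic within-cluster variance s**2 / n.
    """
    centers = rng.standard_normal((n, K))
    centers /= np.linalg.norm(centers, axis=0)   # unit-norm centers
    labels = rng.integers(0, K, size=p)          # cluster index of each column
    noise = (s / np.sqrt(n)) * rng.standard_normal((n, p))
    X = centers[:, labels] + noise
    return X, centers, labels

def coherence(A):
    """Largest absolute inner product between distinct normalized columns."""
    A = A / np.linalg.norm(A, axis=0)
    G = np.abs(A.T @ A)
    np.fill_diagonal(G, 0.0)
    return G.max()

def lasso_objective(X, y, b, lam):
    """LASSO objective 0.5 * ||y - X b||_2^2 + lam * ||b||_1."""
    return 0.5 * np.sum((y - X @ b) ** 2) + lam * np.sum(np.abs(b))

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize the LASSO objective by proximal gradient descent (ISTA)."""
    L = np.linalg.norm(X, 2) ** 2                # Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        g = b - X.T @ (X @ b - y) / L            # gradient step on the quadratic part
        b = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft-thresholding
    return b

rng = np.random.default_rng(0)
n, p, K, s, sigma = 200, 400, 10, 0.1, 0.1      # hypothetical problem sizes
X, centers, labels = sample_mixture_design(n, p, K, s, rng)

beta = np.zeros(p)
beta[rng.choice(p, size=5, replace=False)] = 1.0 # 5-sparse ground truth
y = X @ beta + sigma * rng.standard_normal(n)

lam = sigma * np.sqrt(2 * np.log(p))             # assumed penalty level
beta_hat = lasso_ista(X, y, lam)

print("coherence of X:      ", coherence(X))
print("coherence of centers:", coherence(centers))
print("prediction error:    ", np.sum((X @ (beta_hat - beta)) ** 2) / n)
```

In this sketch, columns that share a center are nearly collinear, so the coherence of $X$ is close to one, while the coherence of the $n \times K$ matrix of centers stays small. This is the situation the abstract describes, in which incoherence is assumed only for the centers.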