Mixture models are widely used to fit complex and multimodal datasets. In this paper we study mixtures with high-dimensional sparse latent parameter vectors and consider the problem of support recovery of those vectors. While parameter learning in mixture models is well studied, the sparsity constraint remains relatively unexplored. Sparsity of parameter vectors is a natural constraint in a variety of settings, and support recovery is a major step towards parameter estimation. We provide efficient algorithms for support recovery whose sample complexity depends logarithmically on the dimensionality of the latent space. Our algorithms are quite general: they apply to 1) mixtures of many different canonical distributions, including Uniform, Poisson, Laplace, and Gaussian, among others, and 2) mixtures of linear regressions and linear classifiers with Gaussian covariates, under different assumptions on the unknown parameters. In most of these settings, our results are the first guarantees for the problem; in the rest, they improve on existing work.
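
To make the notion of support recovery concrete, the following is a minimal illustrative sketch and not the algorithm of this paper. It assumes a two-component Gaussian mixture with identity covariance and sparse mean vectors, and it recovers only the union of the supports by thresholding per-coordinate empirical second moments (a coordinate outside every support has second moment 1). The function names, the threshold, and all numeric values are hypothetical choices made for illustration.

```python
import numpy as np


def sample_sparse_gaussian_mixture(means, weights, n, rng):
    """Draw n samples from sum_k weights[k] * N(means[k], I)."""
    labels = rng.choice(len(weights), size=n, p=weights)
    noise = rng.standard_normal((n, means.shape[1]))
    return means[labels] + noise


def recover_union_support(samples, threshold=0.5):
    """Return coordinates whose empirical second moment exceeds 1 + threshold.

    Under the assumed model, E[x_j^2] = 1 + sum_k weights[k] * means[k, j]^2,
    which equals 1 exactly when coordinate j lies outside every support.
    """
    second_moment = np.mean(samples ** 2, axis=0)
    return set(np.flatnonzero(second_moment > 1.0 + threshold).tolist())


# Hypothetical instance: dimension 200, two sparse mean vectors.
rng = np.random.default_rng(0)
d = 200
means = np.zeros((2, d))
means[0, [3, 17, 42]] = 2.0
means[1, [17, 99]] = -2.0
weights = np.array([0.5, 0.5])

samples = sample_sparse_gaussian_mixture(means, weights, n=5000, rng=rng)
recovered = recover_union_support(samples)
true_union = set(np.flatnonzero(np.any(means != 0, axis=0)).tolist())
print(sorted(recovered), recovered == true_union)
```

This simple estimator does not attribute coordinates to individual components; separating the support of each latent vector is the harder problem the abstract refers to.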