This paper addresses the task of estimating a covariance matrix under a patternless sparsity assumption. In contrast to existing approaches based on thresholding or shrinkage penalties, we propose a likelihood-based method that regularizes the distance from the covariance estimate to a symmetric sparsity set. This formulation avoids unwanted shrinkage induced by more common norm penalties and enables optimization of the resulting non-convex objective by solving a sequence of smooth, unconstrained subproblems. These subproblems are generated and solved via the proximal distance version of the majorization-minimization principle. The resulting algorithm executes rapidly, gracefully handles settings where the number of parameters exceeds the number of cases, yields a positive definite solution, and enjoys desirable convergence properties. Empirically, we demonstrate that our approach outperforms competing methods by several metrics across a suite of simulated experiments. Its merits are illustrated on an international migration dataset and a classic case study on flow cytometry. Our findings suggest that the marginal and conditional dependency networks for the cell signalling data are more similar than previously concluded.
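To make the distance-to-sparsity-set idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes the sparsity set consists of symmetric matrices with at most k nonzero off-diagonal pairs and an unconstrained diagonal; the abstract does not spell out the exact set, so this is an illustrative choice. The function names and the penalty weight `rho` are likewise hypothetical. The sketch shows the Euclidean projection onto that set, the squared-distance penalty, and the surrogate obtained by projecting a previous iterate, which is the kind of majorization a proximal distance MM scheme relies on.

```python
from itertools import combinations


def project_symmetric_sparse(S, k):
    """Project a symmetric matrix S (list of lists) onto the ASSUMED set:
    symmetric matrices with at most k nonzero off-diagonal pairs.
    The diagonal is left untouched (an assumption of this sketch)."""
    p = len(S)
    # Rank upper-triangular positions by absolute value, largest first.
    pairs = sorted(combinations(range(p), 2),
                   key=lambda ij: abs(S[ij[0]][ij[1]]), reverse=True)
    P = [[0.0] * p for _ in range(p)]
    for i in range(p):
        P[i][i] = S[i][i]
    # Keep the k largest pairs, mirrored to preserve symmetry.
    for i, j in pairs[:k]:
        P[i][j] = S[i][j]
        P[j][i] = S[j][i]
    return P


def sq_frobenius(A, B):
    """Squared Frobenius distance between two equally sized matrices."""
    return sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def distance_penalty(S, k, rho):
    """(rho / 2) * squared distance from S to the assumed sparsity set."""
    return 0.5 * rho * sq_frobenius(S, project_symmetric_sparse(S, k))


def surrogate_penalty(S, anchor, k, rho):
    """Majorizer of distance_penalty: project the previous iterate
    (anchor) instead of S. It is smooth in S and touches the true
    penalty when S equals anchor."""
    return 0.5 * rho * sq_frobenius(S, project_symmetric_sparse(anchor, k))


if __name__ == "__main__":
    S = [[2.0, 0.9, 0.1],
         [0.9, 1.5, -0.3],
         [0.1, -0.3, 1.0]]
    print(project_symmetric_sparse(S, k=1))
    print(distance_penalty(S, k=1, rho=1.0))
```

Because the surrogate is a plain quadratic in S, adding it to a smooth loss yields a smooth, unconstrained subproblem. The penalty measures only how far S lies from the set, so entries the projection retains are not penalized, in contrast to a norm penalty that shrinks every entry.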