We address the task of identifying densely connected subsets of multivariate Gaussian random variables within a graphical model framework. We propose two novel estimators based on the Ordered Weighted $\ell_1$ (OWL) norm: 1) the Graphical OWL (GOWL), a penalized likelihood method that applies the OWL norm to the lower-triangular components of the precision matrix; and 2) the column-by-column Graphical OWL (ccGOWL), which estimates the precision matrix by performing OWL-regularized linear regressions. Both methods can simultaneously identify highly correlated groups of variables and control the sparsity of the resulting precision matrix. We formulate GOWL as a composite optimization problem and establish that the estimator has a unique global solution. In addition, we prove sufficient grouping conditions for each column of the ccGOWL precision matrix estimate. We propose proximal descent algorithms to find the optimum for both estimators. For synthetic data where group structure is present, the ccGOWL estimator requires significantly less computation and achieves similar or greater accuracy than state-of-the-art estimators. Timing comparisons demonstrate the superior computational efficiency of ccGOWL. We illustrate the grouping performance of the ccGOWL method on a cancer gene expression data set and an equities data set.
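To make the grouping behaviour of the OWL penalty concrete, the following is a minimal sketch of the OWL norm and its proximal operator, the building block a proximal descent method would call at each step. It is not the authors' implementation. It assumes the standard definition from the OWL literature, $\Omega_w(x) = \sum_i w_i |x|_{[i]}$ with non-increasing, non-negative weights $w_1 \ge \dots \ge w_p \ge 0$ and $|x|_{[i]}$ the $i$-th largest magnitude; the function names `owl_norm` and `prox_owl` are hypothetical.

```python
def owl_norm(x, w):
    """OWL norm: pair non-increasing weights w with |x| sorted in decreasing order."""
    mags = sorted((abs(xi) for xi in x), reverse=True)
    return sum(wi * mi for wi, mi in zip(w, mags))


def prox_owl(v, w):
    """Proximal operator of the OWL norm at v (assumes w is non-increasing, w >= 0)."""
    # Indices that sort |v| in decreasing order.
    order = sorted(range(len(v)), key=lambda i: -abs(v[i]))
    # Sorted magnitudes shifted by the weights.
    z = [abs(v[i]) - wi for i, wi in zip(order, w)]

    # Pool adjacent violators: project z onto non-increasing sequences.
    # Each block stores [sum of values, number of values].
    blocks = []
    for val in z:
        blocks.append([val, 1])
        # Merge while the previous block's mean does not exceed the current one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] <= blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c

    # Expand block means and clip at zero (this is where sparsity arises).
    fitted = [max(s / c, 0.0) for s, c in blocks for _ in range(c)]

    # Undo the sort and restore the original signs.
    out = [0.0] * len(v)
    for rank, i in enumerate(order):
        out[i] = fitted[rank] if v[i] >= 0 else -fitted[rank]
    return out


# Two inputs of similar magnitude are pulled to a common magnitude (grouping).
grouped = prox_owl([-3.0, 2.9, 0.1], [1.0, 0.5, 0.0])
```

In this sketch, entries whose shifted magnitudes violate the ordering are averaged into a single block, which is the mechanism by which similar coefficients receive identical magnitudes. With all weights equal, the operator reduces to ordinary soft-thresholding, i.e. the $\ell_1$ case.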