Gaussian graphical models are widely used to model dependence among variables, and finite Gaussian mixtures are a standard choice for model-based clustering of continuous features. With the increasing availability of high-dimensional datasets, a methodological link between the two approaches has been established, providing a framework for penalized model-based clustering in the presence of large precision matrices. However, current methods do not account for the fact that groups may exhibit different degrees of association among the variables, and so implicitly assume similar levels of sparsity across classes. We overcome this limitation by deriving group-wise penalty factors that automatically enforce under- or over-connectivity in the estimated graphs. The approach is entirely data-driven and does not require any additional hyper-parameter specification. Experiments on simulated data support the validity of our proposal.
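
To make the idea of group-wise penalty factors concrete, one generic way to write a penalized mixture log-likelihood with a graphical-lasso-type penalty is sketched below. The notation is an assumption introduced here for illustration and is not taken from the abstract: $\pi_k$, $\mu_k$ and $\Omega_k$ denote the mixing proportion, mean and precision matrix of group $k$, $\phi$ is the multivariate Gaussian density, $\lambda \ge 0$ is a common shrinkage parameter, $P_k$ is a hypothetical matrix of non-negative group-specific penalty factors, and $\circ$ is the element-wise product.

```latex
\[
\ell_{\mathrm{pen}}(\Theta)
  = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k \,
      \phi\!\left(x_i;\, \mu_k,\, \Omega_k^{-1}\right)
  \;-\; \lambda \sum_{k=1}^{K} \left\lVert P_k \circ \Omega_k \right\rVert_1
\]
```

Under this sketch, setting every entry of each $P_k$ to one recovers a penalty that shrinks all groups equally, which corresponds to the implicit assumption of similar sparsity across classes. Larger entries in $P_k$ would push the graph of group $k$ toward fewer edges, and smaller entries toward more.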