Copula Graphical Models for Heterogeneous Mixed Data

This article proposes a graphical model that handles mixed-type, multi-group data. The motivation comes from real-world observational data, which often contain groups of samples obtained under heterogeneous conditions in space and time, potentially resulting in differences in network structure among groups. The i.i.d. assumption is therefore unrealistic, and fitting a single graphical model to all data yields a network that does not accurately represent the between-group differences. In addition, real-world observational data are typically of mixed discrete-and-continuous type, violating the Gaussian assumption that is typical of graphical models, so that such models cannot adequately recover the underlying graph structure. The proposed model accounts for both properties. It treats the observed data as transformed latent Gaussian data by means of the Gaussian copula, which retains the attractive properties of the Gaussian distribution, such as estimating the optimal number of model parameters using the inverse covariance matrix. The multi-group setting is addressed by jointly fitting a graphical model for each group and applying the fused group penalty to fuse similar graphs together. In an extensive simulation study, the proposed model is evaluated against alternative models and is better able to recover the true underlying graph structure for the different groups. Finally, the proposed model is applied to real production-ecological data on on-farm maize yield to showcase the added value of the method in generating new hypotheses for production ecologists.
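The two ingredients named in the abstract, a Gaussian copula view of mixed data and a fused penalty across groups, can be illustrated with a minimal sketch. This is not the authors' estimator: the abstract does not describe the fitting procedure, so the sketch assumes a simple rank-based normal-scores transform as a stand-in for the latent Gaussian step, and a penalty of the fused graphical lasso form (an L1 term on off-diagonal entries plus an L1 term on pairwise differences between groups). The names `normal_scores`, `fused_penalty`, `lam1` and `lam2` are hypothetical.

```python
from statistics import NormalDist


def normal_scores(column):
    """Map one observed variable to approximate latent Gaussian scores.

    Ranks are averaged over ties, so equal discrete values share a score.
    Ranks are rescaled to (0, 1) and passed through the standard normal
    quantile function. Assumption: a rank-based stand-in for the copula step.
    """
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < n and column[order[end + 1]] == column[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # 1-based average rank of the tie block
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        start = end + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


def fused_penalty(precisions, lam1, lam2):
    """Penalty over a list of K group-specific precision matrices.

    lam1 weights sparsity of the off-diagonal entries within each group;
    lam2 weights entrywise differences between every pair of groups,
    which pulls similar graphs towards each other.
    """
    k_groups = len(precisions)
    p = len(precisions[0])
    sparsity = sum(
        abs(theta[i][j])
        for theta in precisions
        for i in range(p)
        for j in range(p)
        if i != j
    )
    fusion = sum(
        abs(precisions[a][i][j] - precisions[b][i][j])
        for a in range(k_groups)
        for b in range(a + 1, k_groups)
        for i in range(p)
        for j in range(p)
    )
    return lam1 * sparsity + lam2 * fusion
```

In a full method, a term like `fused_penalty` would be subtracted from the sum of the group-wise Gaussian log-likelihoods evaluated on the latent scores, and the result maximised over the precision matrices. A larger `lam2` forces the group graphs to share more edges, while `lam2 = 0` reduces to fitting each group separately.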

