This article proposes a graphical model for mixed-type, multi-group data. The motivation comes from real-world observational data, which often contain groups of samples obtained under heterogeneous conditions in space and time, potentially resulting in differences in network structure among groups. The i.i.d. assumption is therefore unrealistic, and fitting a single graphical model to all data yields a network that does not accurately represent the between-group differences. In addition, real-world observational data are typically of mixed discrete-and-continuous type, violating the Gaussian assumption that is typical of graphical models and leaving such models unable to adequately recover the underlying graph structure. The proposed model accounts for these properties by treating the observed data as transformed latent Gaussian data by means of the Gaussian copula, thereby retaining the attractive properties of the Gaussian distribution, such as estimating the optimal number of model parameters using the inverse covariance matrix. The multi-group setting is addressed by jointly fitting a graphical model for each group and applying the fused group penalty to fuse similar graphs together. In an extensive simulation study, the proposed model is evaluated against alternative models and is better able to recover the true underlying graph structure for the different groups. Finally, the proposed model is applied to real production-ecological data pertaining to on-farm maize yield in order to showcase the added value of the proposed method in generating new hypotheses for production ecologists.
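The latent Gaussian idea behind the copula step can be illustrated with a minimal sketch. This is not the estimator proposed in the article: it is a simplified rank-based normal-score transform, applied separately within each group, that maps each observed variable to approximately Gaussian scores. The function names `normal_scores` and `transform_by_group` and the toy data are hypothetical, and the jointly penalized estimation of the group-specific inverse covariance matrices is not shown.

```python
from statistics import NormalDist


def normal_scores(values):
    """Map observations to Gaussian scores via their scaled ranks (simplified)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the block over tied values so that they share one average rank.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # 1-based average rank of the block
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        start = end + 1
    std_normal = NormalDist()
    # Dividing by n + 1 keeps the probabilities strictly inside (0, 1).
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


def transform_by_group(groups):
    """Apply the transform to every variable (column) within each group."""
    return {
        name: [normal_scores(column) for column in columns]
        for name, columns in groups.items()
    }


# Hypothetical toy data: two groups, each with one continuous and one binary variable.
toy_groups = {
    "group_a": [[2.1, 0.4, 3.3, 1.7], [1, 0, 1, 1]],
    "group_b": [[5.0, 4.2, 6.1, 4.9], [0, 0, 1, 0]],
}
latent = transform_by_group(toy_groups)
```

Tied values, which are common in discrete variables, receive identical scores here. That is a crude simplification and one reason a dedicated treatment of mixed data, as described in the article, is needed.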