This article proposes a graphical model that can handle mixed-type, multi-group data. The motivation for such a model originates from real-world observational data, which often contain groups of samples obtained under heterogeneous conditions in space and time, potentially resulting in differences in network structure among groups. The i.i.d. assumption is therefore unrealistic, and fitting a single graphical model to all the data results in a network that does not accurately represent the between-group differences. In addition, real-world observational data are typically of mixed type, violating the Gaussian assumption that is typical of graphical models, so that such models cannot adequately recover the underlying graph structure. The proposed model accounts for these properties by treating observed data as transformed latent Gaussian data, which allows the attractive properties of the Gaussian distribution, such as partial correlations obtained from the inverse covariance matrix, to be utilised. In an extensive simulation study, the proposed model is evaluated against alternative models and is better able to recover the true underlying graph structure for different groups. Finally, the proposed model is applied to real production-ecological data pertaining to on-farm maize yield in order to showcase the added value of the proposed method in generating new hypotheses for production ecologists.
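The role of the inverse covariance matrix mentioned in the abstract can be illustrated with a minimal sketch. It relies on the standard Gaussian identity that the partial correlation between variables i and j equals -θ_ij / sqrt(θ_ii θ_jj), where θ denotes the precision (inverse covariance) matrix. The function name `partial_correlations` and the 3-variable example matrix are hypothetical and chosen only for illustration; this is not the estimation procedure of the proposed model.

```python
import numpy as np

def partial_correlations(precision):
    """Convert a precision (inverse covariance) matrix to partial correlations.

    Uses the Gaussian identity rho_ij = -theta_ij / sqrt(theta_ii * theta_jj).
    """
    precision = np.asarray(precision, dtype=float)
    d = np.sqrt(np.diag(precision))        # square roots of the diagonal entries
    pcor = -precision / np.outer(d, d)     # scale and flip the sign of each entry
    np.fill_diagonal(pcor, 1.0)            # by convention the diagonal is set to 1
    return pcor

# Hypothetical precision matrix for a chain graph 1 - 2 - 3 (illustrative values).
precision = np.array([[ 2.0, -1.0,  0.0],
                      [-1.0,  2.0, -1.0],
                      [ 0.0, -1.0,  2.0]])

pcor = partial_correlations(precision)
print(pcor)
```

A zero off-diagonal entry in the precision matrix gives a zero partial correlation, which corresponds to a missing edge in the graph; in this example variables 1 and 3 are conditionally independent given variable 2.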