We develop a model-based boosting approach for multivariate distributional regression within the framework of generalized additive models for location, scale, and shape. Our approach enables the simultaneous modeling of all distribution parameters of an arbitrary parametric distribution of a multivariate response conditional on explanatory variables, and it remains applicable to potentially high-dimensional data. Moreover, the boosting algorithm incorporates data-driven variable selection that takes different types of effects into account. A particular merit of our approach is that it allows the association between multiple continuous or discrete outcomes to be modeled as a function of the relevant covariates. After a detailed simulation study investigating estimation and prediction performance, we demonstrate the full flexibility of our approach in three diverse biomedical applications. The first is based on high-dimensional genomic cohort data from the UK Biobank and considers a bivariate binary response (chronic ischemic heart disease and high cholesterol). Here, we are able to identify genetic variants that are informative for the association between cholesterol and heart disease. The second application considers the demand for health care in Australia, with the number of consultations and the number of prescribed medications as a bivariate count response. The third application analyzes two dimensions of childhood undernutrition in Nigeria as a bivariate response; we find that the correlation between the two undernutrition scores varies considerably with the child's age and the region in which the child lives.
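To make the idea of boosting all distribution parameters at once more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a bivariate Gaussian response, linear base-learners, a log link for the two scale parameters, a tanh link for the correlation, and a non-cyclic update that changes only the one parameter whose best base-learner reduces the loss most. The function names (`nll`, `gradients`, `boost`, `simulate`), the step length, and the number of iterations are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def unpack(eta):
    """Map the five additive predictors to mu1, mu2, sigma1, sigma2, rho."""
    return eta[0], eta[1], np.exp(eta[2]), np.exp(eta[3]), np.tanh(eta[4])

def nll(y, eta):
    """Mean negative log-likelihood of a bivariate Gaussian."""
    mu1, mu2, s1, s2, rho = unpack(eta)
    z1, z2 = (y[:, 0] - mu1) / s1, (y[:, 1] - mu2) / s2
    omr = 1.0 - rho**2
    q = z1**2 - 2.0 * rho * z1 * z2 + z2**2
    return np.mean(np.log(2.0 * np.pi) + np.log(s1) + np.log(s2)
                   + 0.5 * np.log(omr) + q / (2.0 * omr))

def gradients(y, eta):
    """Derivatives of the log-likelihood w.r.t. each predictor, shape (5, n)."""
    mu1, mu2, s1, s2, rho = unpack(eta)
    z1, z2 = (y[:, 0] - mu1) / s1, (y[:, 1] - mu2) / s2
    omr = 1.0 - rho**2
    q = z1**2 - 2.0 * rho * z1 * z2 + z2**2
    return np.stack([
        (z1 - rho * z2) / (s1 * omr),          # location of margin 1
        (z2 - rho * z1) / (s2 * omr),          # location of margin 2
        -1.0 + z1 * (z1 - rho * z2) / omr,     # log scale of margin 1
        -1.0 + z2 * (z2 - rho * z1) / omr,     # log scale of margin 2
        rho + z1 * z2 - rho * q / omr,         # atanh of the correlation
    ])

def boost(X, y, n_iter=200, step=0.1):
    """Component-wise gradient boosting with one update per iteration."""
    n, p = X.shape
    # Design: intercept plus centred covariates (assumed non-constant).
    D = np.column_stack([np.ones(n), X - X.mean(axis=0)])
    ss = (D**2).sum(axis=0)
    eta = np.zeros((5, n))
    eta[0], eta[1] = y[:, 0].mean(), y[:, 1].mean()            # offsets
    eta[2], eta[3] = np.log(y[:, 0].std()), np.log(y[:, 1].std())
    coef = np.zeros((5, p + 1))
    for _ in range(n_iter):
        u = gradients(y, eta)
        beta = (u @ D) / ss                    # least-squares slope per column
        rss = (u**2).sum(axis=1, keepdims=True) - beta**2 * ss
        j = rss.argmin(axis=1)                 # best base-learner per parameter
        best_k, best_loss = 0, np.inf
        for k in range(5):                     # try each parameter in turn
            trial = eta.copy()
            trial[k] += step * beta[k, j[k]] * D[:, j[k]]
            loss = nll(y, trial)
            if loss < best_loss:
                best_k, best_loss = k, loss
        k = best_k                             # keep only the best update
        eta[k] += step * beta[k, j[k]] * D[:, j[k]]
        coef[k, j[k]] += step * beta[k, j[k]]
    return coef, eta

def simulate(n=400, p=10, seed=0):
    """Toy data: x0 drives the correlation, x1 drives the first location."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    rho = np.tanh(0.8 * X[:, 0])
    e1, e2 = rng.normal(size=n), rng.normal(size=n)
    y = np.column_stack([X[:, 1] + e1, rho * e1 + np.sqrt(1 - rho**2) * e2])
    return X, y
```

Each row of the returned `coef` belongs to one distribution parameter, with column 0 holding the intercept update and the remaining columns the covariates. Entries that stay at zero correspond to covariates that were never selected, which is how early stopping yields variable selection in this kind of scheme. In practice the number of iterations would be tuned, for example by cross-validation, rather than fixed as it is here.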