The concept of explained variation is widely used to assess the relevance of factors in the analysis of variance. In the linear model, it is the main ingredient of the coefficient of determination, which is widely used to assess the proportion of variation explained, to determine model goodness-of-fit, and to compare models with different covariates. There is not yet a consensus on a similar concept of explained variation for the class of linear mixed models. Based on the restricted maximum likelihood equations, we prove a full decomposition of the sum of squares of the dependent variable in the context of the variance components form of the linear mixed model. This decomposition is dimensionless relative to the variation of the dependent variable, has an intuitive and simple definition in terms of variance explained, is additive for several random effects, and reduces to the decomposition in the linear model. Our result leads us to propose a natural extension of the well-known adjusted coefficient of determination to the linear mixed model. To this end, we introduce novel measures for the explained variation, which we allocate to specific contributions of covariates associated with fixed and random effects. These partial explained variations are easily interpretable quantities that quantify the relevance of covariates associated with both fixed and random effects on a common scale, and thus allow the covariates to be ranked by importance. Our approach is made readily available in the user-friendly R package ``explainedVariation''. We illustrate its usefulness on two public low-dimensional datasets as well as on two high-dimensional datasets in the context of genome-wide association studies.
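For orientation, the linear-model decomposition to which the proposed decomposition is said to reduce can be written as below. This is the standard textbook form for an ordinary least squares fit with an intercept, not the mixed-model result of the paper, and the notation ($y_i$, $\hat{y}_i$, $\bar{y}$, $n$, $p$) is assumed here for illustration.

```latex
% Assumed notation (not from the source):
%   y_i        observed responses, i = 1, ..., n
%   \hat{y}_i  fitted values from an OLS fit that includes an intercept
%   \bar{y}    sample mean of the responses
%   p          number of covariates, excluding the intercept
\begin{align}
\underbrace{\sum_{i=1}^{n} (y_i - \bar{y})^2}_{\mathrm{TSS}}
  &= \underbrace{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}_{\mathrm{ESS}}
   + \underbrace{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}_{\mathrm{RSS}}, \\
R^2 &= \frac{\mathrm{ESS}}{\mathrm{TSS}} = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}, \\
R^2_{\mathrm{adj}} &= 1 - \frac{\mathrm{RSS}/(n-p-1)}{\mathrm{TSS}/(n-1)}.
\end{align}
```

Dividing the first identity by TSS gives dimensionless shares that sum to one, which is the sense in which a decomposition of this kind is relative to the variation of the dependent variable.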