Model development typically weighs data structure, subject matter knowledge, model assumptions, and goodness of fit. To diagnose issues with any of these factors, it helps to understand regression model estimates at a more granular level. We propose a new method for decomposing point estimates from a regression model via weights placed on data clusters. The weights depend only on the model specification and data availability, so they can be used to explicitly link the effects of data imbalance and model assumptions to actual model estimates. In linear models, this weight matrix is known in the existing literature as the hat matrix. We extend it to Bayesian hierarchical regression models that incorporate prior information and complicated dependence structures through the covariance among random effects. We show that the model weights, which we call borrowing factors, generalize shrinkage and information borrowing to all regression models. In contrast, work on the hat matrix has focused mainly on its diagonal elements, which indicate the amount of leverage. We also provide practically useful metrics that summarize the borrowing factors. We present the theoretical properties of the borrowing factors and associated metrics and demonstrate their use in two examples. By explicitly quantifying borrowing and shrinkage, researchers can better incorporate domain knowledge and evaluate model performance and the impacts of data properties such as data imbalance or influential points.
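As a rough illustration of the idea that fitted values are weighted sums of the observations, the sketch below builds the classical hat matrix for a two-cluster model and a penalized analogue that corresponds to a random-intercept model with a known variance ratio. It is not the borrowing-factor definition from the paper: the data, the penalty value `lam`, and the helper name `weight_matrix` are all hypothetical, chosen only to show how weights can spill across clusters once a hierarchical structure is imposed.

```python
import numpy as np

def weight_matrix(A, P):
    """Return H such that fitted = H @ y for penalized least squares.

    Minimizes ||y - A b||^2 + b' P b. With P = 0 this is the classical
    hat matrix (pinv handles the rank-deficient design).
    """
    return A @ np.linalg.pinv(A.T @ A + P) @ A.T

# Hypothetical imbalanced data: cluster 0 has 2 observations, cluster 1 has 6.
cluster = np.array([0, 0, 1, 1, 1, 1, 1, 1])
y = np.array([1.0, 3.0, 10.0, 11.0, 9.0, 10.0, 12.0, 8.0])

# Design: global intercept plus one indicator column per cluster.
A = np.column_stack([np.ones(len(y)), np.eye(2)[cluster]])

# No penalty: each fitted value is its own cluster mean (no borrowing).
H_ols = weight_matrix(A, np.zeros((3, 3)))

# Penalize only the cluster effects (assumed variance ratio lam = 2);
# the intercept stays unpenalized, as in a random-intercept model.
lam = 2.0
H_pen = weight_matrix(A, np.diag([0.0, lam, lam]))

fitted_ols = H_ols @ y
fitted_pen = H_pen @ y

# Weight that the first observation's fitted value places on each cluster.
own_weight = H_pen[0, cluster == 0].sum()
borrowed_weight = H_pen[0, cluster == 1].sum()

print("unpenalized fit:", fitted_ols.round(3))
print("penalized fit:  ", fitted_pen.round(3))
print("weight on own cluster:", round(own_weight, 3))
print("weight borrowed from other cluster:", round(borrowed_weight, 3))
```

In this toy setup the unpenalized weights on the other cluster are zero, while the penalized fit for the small cluster places positive weight on observations from the large cluster and is pulled toward the overall level. Summing a row of the weight matrix over each cluster, as done above, is one simple way to read off how much a fitted value relies on its own data versus data from elsewhere.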