Traditionally, least squares regression is mainly concerned with the effects of individual predictor variables, but strongly correlated variables generate multicollinearity, which makes their individual effects difficult to study. Existing methods for handling multicollinearity, such as ridge regression, are complicated. To resolve the multicollinearity issue without abandoning simple least squares regression, we propose a group approach to least squares regression that studies the effects of groups of variables, for situations where the predictor variables fall into groups with strong within-group correlations but weak between-group correlations. Using an all-positive-correlations arrangement of the strongly correlated variables, we first characterize group effects that are meaningful and can be accurately estimated. We then present the group approach with numerical examples and demonstrate its advantages over existing methods for handling multicollinearity. We also address a common misconception about the prediction accuracy of the least squares estimated model and discuss, through an example, similar group effects in generalized linear models.
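As a rough illustration of why a group effect can be estimated accurately when individual effects cannot, the following sketch simulates one group of three strongly and positively correlated predictors plus one independent predictor, and fits ordinary least squares repeatedly. Everything in it is an assumption made for illustration: the data-generating model, the coefficient values, the sample sizes, and in particular the choice of the equally weighted average of the group's coefficients as the group effect. The abstract does not specify which group effects are characterized.

```python
import numpy as np

def simulate(n_rep=500, n=50, noise_sd=0.05, seed=0):
    """Repeatedly fit OLS on data with one strongly correlated group.

    Hypothetical setup: predictors 0-2 share a common factor z, so their
    pairwise correlations are all positive and close to 1; predictor 3
    is independent of the group.
    """
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, 2.0, 3.0, 0.5])  # assumed true coefficients
    est = np.empty((n_rep, beta.size))
    for r in range(n_rep):
        z = rng.standard_normal(n)
        group = z[:, None] + noise_sd * rng.standard_normal((n, 3))
        other = rng.standard_normal((n, 1))
        X = np.hstack([group, other])
        y = X @ beta + rng.standard_normal(n)
        # Centering X and y is equivalent to fitting an intercept.
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        est[r] = np.linalg.lstsq(Xc, yc, rcond=None)[0]
    return beta, est

beta, est = simulate()

# Spread of each individual coefficient estimate within the group.
sd_individual = est[:, :3].std(axis=0)

# Assumed group effect: the equally weighted average of the group's coefficients.
avg_effect = est[:, :3].mean(axis=1)
sd_group = avg_effect.std()

print("true average group effect:", beta[:3].mean())
print("sd of individual estimates:", sd_individual)
print("sd of average group effect estimate:", sd_group)
```

Under these assumptions the individual estimates vary widely from sample to sample, while the estimated average effect of the group stays close to its true value. This matches the abstract's point that suitable group effects remain estimable under multicollinearity.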