Model averaging (MA), a technique for combining estimators from a set of candidate models, has attracted increasing attention in machine learning and statistics. In the existing literature, there is an implicit understanding that MA can be viewed as a form of shrinkage estimation that draws the response vector towards the subspaces spanned by the candidate models. This paper explores this perspective by establishing connections between MA and shrinkage in a linear regression setting with multiple nested models. We first demonstrate that the optimal MA estimator is the best linear estimator with monotone non-increasing weights in a Gaussian sequence model. The Mallows MA, which estimates weights by minimizing the Mallows' $C_p$, is a variation of the positive-part Stein estimator. Motivated by these connections, we develop a novel MA procedure based on a blockwise Stein estimation. Our resulting Stein-type MA estimator is asymptotically optimal across a broad parameter space when the variance is known. Numerical results support our theoretical findings. The connections established in this paper may open up new avenues for investigating MA from different perspectives. A discussion on some topics for future research concludes the paper.
翻译:暂无翻译