We demonstrate that the forecasting combination puzzle is a consequence of the methodology commonly used to produce forecast combinations. By the combination puzzle, we refer to the empirical finding that predictions formed by combining multiple forecasts in ways that seek to optimize forecast performance often do not outperform more naive (e.g., equally weighted) approaches. In particular, we demonstrate that, due to the manner in which such forecasts are typically produced, tests that aim to discriminate between the predictive accuracy of competing combination strategies can have low power, and can lack size control, leading to an outcome that favours the naive approach. We show that this poor performance is due to the behavior of the corresponding test statistic, which has a non-standard asymptotic distribution under the null hypothesis of no inferior predictive accuracy, rather than the standard normal distribution that is typically adopted. In addition, we demonstrate that the low power of such predictive accuracy tests in the forecast combination setting can be completely avoided if more efficient estimation strategies are used in the production of the combinations, when feasible. We illustrate these findings both in the context of forecasting a functional of interest and in terms of predictive densities. A short empirical example using daily financial returns exemplifies how researchers can avoid the puzzle in practical settings.