We propose a principal components regression method based on maximizing a joint pseudo-likelihood for responses and predictors. Our method uses both responses and predictors to select the linear combinations of the predictors relevant for the regression, thereby addressing an oft-cited deficiency of conventional principal components regression. The proposed estimator is shown to be consistent in a wide range of settings, including ones with non-normal and dependent observations: conditions on the first and second moments suffice if the number of predictors ($p$) is fixed, the number of observations ($n$) tends to infinity, and dependence is weak, while stronger distributional assumptions are needed when $p \to \infty$ with $n$. We obtain the estimator's asymptotic distribution as the projection of a multivariate normal random vector onto a tangent cone of the parameter set at the true parameter, and find that the estimator is asymptotically more efficient than competing ones. In simulations, our method is substantially more accurate than conventional principal components regression and compares favorably to partial least squares and predictor envelopes. The method's practical usefulness is illustrated in a data example with cross-sectional prediction of stock returns.
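The deficiency of conventional principal components regression mentioned above can be made concrete with a small simulation. The sketch below is not the proposed pseudo-likelihood estimator; it implements only the conventional baseline, which chooses components from the predictors alone. The function name `pcr_fit`, the simulated design, and all numeric settings are assumptions made for illustration.

```python
import numpy as np


def pcr_fit(X, y, k):
    """Conventional PCR: regress y on the first k principal components of X.

    Hypothetical helper for illustration; components are chosen from X only.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    # Rows of Vt are principal directions, ordered by decreasing sample variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                      # p x k matrix of leading directions
    scores = Xc @ V_k                   # n x k component scores
    gamma, *_ = np.linalg.lstsq(scores, yc, rcond=None)
    beta = V_k @ gamma                  # coefficients on the original predictors
    intercept = y_mean - x_mean @ beta
    return intercept, beta


# Assumed simulation: the response depends only on the lowest-variance predictor.
rng = np.random.default_rng(0)
n, p = 500, 5
scales = np.array([5.0, 4.0, 3.0, 2.0, 0.5])
X = rng.standard_normal((n, p)) * scales
beta_true = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
y = X @ beta_true + 0.1 * rng.standard_normal(n)

_, beta_k1 = pcr_fit(X, y, k=1)    # keeps only the highest-variance direction
_, beta_full = pcr_fit(X, y, k=p)  # keeps all directions (equivalent to least squares)

err_k1 = np.linalg.norm(beta_k1 - beta_true)
err_full = np.linalg.norm(beta_full - beta_true)
print(f"coefficient error, k=1: {err_k1:.3f}")
print(f"coefficient error, k=p: {err_full:.3f}")
```

With one component retained, the fitted coefficient vector is confined to the leading principal direction, which carries almost no information about the response in this design, so the error stays large. A method that also consults the response when choosing directions is meant to avoid this failure.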