The envelope method was recently proposed as a way to reduce the dimension of the responses in multivariate regression. However, when data are missing, applying the envelope method to the complete-case observations alone may lead to biased and inefficient results. In this paper, we generalize envelope estimation to settings where the predictors and/or the responses are missing at random. Specifically, we incorporate the envelope structure into the expectation-maximization (EM) algorithm. Because the parameters under the envelope method are not pointwise identifiable, the EM algorithm for the envelope method is not straightforward and requires a special decomposition. Our method is guaranteed to be at least as efficient as the standard EM algorithm. Moreover, our method has the potential to outperform the full-data MLE. We give asymptotic properties of our method under both normal and non-normal cases. The efficiency gain over the standard EM algorithm is confirmed in simulation studies and in an application to the Chronic Renal Insufficiency Cohort (CRIC) study.