Multivariate imputation by chained equations (MICE) is one of the most popular approaches to address missing values in a data set. This approach requires specifying a univariate imputation model for every variable under imputation. The specification of which predictors should be included in these univariate imputation models can be a daunting task. Principal component analysis (PCA) can simplify this process by replacing all of the potential imputation model predictors with a few components summarizing their variance. In this article, we extend the use of PCA with MICE to include a supervised aspect whereby information from the variables under imputation is incorporated into the principal component estimation. We conducted an extensive simulation study to assess the statistical properties of MICE with different versions of supervised dimensionality reduction and we compared them with the use of classical unsupervised PCA as a simpler dimensionality reduction technique.
翻译:暂无翻译