When missing values occur in multi-view data, all features in a view are likely to be missing simultaneously. This leads to very large quantities of missing data which, especially when combined with high dimensionality, make the application of conditional imputation methods computationally infeasible. We introduce a new meta-learning imputation method based on stacked penalized logistic regression (StaPLR), which performs imputation in a dimension-reduced space. We evaluate the new imputation method with several imputation algorithms using simulations. The results show that meta-level imputation of missing values leads to good results at a much lower computational cost, and makes the use of advanced imputation algorithms such as missForest and predictive mean matching possible in settings where they would otherwise be computationally infeasible.
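To illustrate the idea of imputing in a dimension-reduced space, here is a minimal Python sketch. It is not the authors' implementation. It assumes each view is reduced to a single meta-feature: the out-of-fold predicted probability from a penalized logistic regression fitted on the observations for which that view is observed. It also assumes a ridge penalty fitted by plain gradient descent, and it uses column-mean imputation as a simple stand-in for algorithms such as missForest or predictive mean matching. All function names, the toy data, and the tuning values are hypothetical.

```python
import math
import random

def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def fit_logreg(X, y, l2=0.1, lr=0.1, epochs=100):
    """Ridge-penalized logistic regression via batch gradient descent (assumed penalty)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            gb += err
            for j in range(p):
                gw[j] += err * xi[j]
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict(model, x):
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def meta_features(views, y, folds=5):
    """One out-of-fold probability per view; None where the whole view is missing."""
    n = len(y)
    Z = [[None] * len(views) for _ in range(n)]
    for v, X in enumerate(views):
        obs = [i for i in range(n) if X[i] is not None]
        for k in range(folds):
            test = obs[k::folds]
            held = set(test)
            train = [i for i in obs if i not in held]
            model = fit_logreg([X[i] for i in train], [y[i] for i in train])
            for i in test:
                Z[i][v] = predict(model, X[i])
    return Z

def impute_column_means(Z):
    """Stand-in imputer: fill each missing meta-feature with its column mean."""
    out = [row[:] for row in Z]
    for v in range(len(Z[0])):
        seen = [row[v] for row in Z if row[v] is not None]
        mean = sum(seen) / len(seen)
        for row in out:
            if row[v] is None:
                row[v] = mean
    return out

# Toy data (hypothetical): 3 views of 20 features each; only view 0 carries signal.
rng = random.Random(0)
n, n_views, p = 120, 3, 20
y = [rng.randint(0, 1) for _ in range(n)]
views = []
for v in range(n_views):
    X = []
    for i in range(n):
        shift = (1.0 if y[i] else -1.0) if v == 0 else 0.0
        X.append([rng.gauss(shift, 1.0) for _ in range(p)])
    views.append(X)

# View-wise missingness: an observation loses all features of a view at once.
for X in views:
    for i in range(n):
        if rng.random() < 0.2:
            X[i] = None

Z = meta_features(views, y)        # n x 3 matrix with missing entries
Z_imputed = impute_column_means(Z) # imputation happens here, at the meta level
meta_model = fit_logreg(Z_imputed, y, l2=0.0, lr=1.0, epochs=500)
```

In this toy setup the imputer sees an n × 3 matrix rather than the n × 60 feature matrix. That reduction is what would make more expensive conditional imputation methods affordable at the meta level.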