Imputation and propensity score weighting are two popular techniques for handling missing data. We address both using regularized M-estimation in a reproducing kernel Hilbert space. Specifically, we first use kernel ridge regression to develop an imputation method for handling item nonresponse. Although this nonparametric approach is promising for imputation, its statistical properties have not been investigated in the literature. Under some conditions on the order of the tuning parameter, we establish the root-$n$ consistency of the kernel ridge regression imputation estimator and show that its asymptotic variance achieves the semiparametric lower bound. We also develop a nonparametric propensity score estimator in the reproducing kernel Hilbert space through a novel application of the maximum entropy method to density ratio function estimation. We show that the resulting propensity score estimator is asymptotically equivalent to the kernel ridge regression imputation estimator. Results from a limited simulation study are presented to confirm our theory. The proposed method is applied to air pollution data measured in Beijing, China.
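As a rough illustration of the imputation idea described above, the following is a minimal sketch of a kernel ridge regression imputation estimator of a population mean. It is not taken from the paper: the Gaussian kernel, the bandwidth, the ridge scaling `lam * n_r`, and the names `gaussian_kernel` and `krr_impute_mean` are all assumptions made for the example, and the response is assumed to be missing at random given the covariates.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def krr_impute_mean(x, y, observed, lam=1e-3, bandwidth=1.0):
    """Estimate the mean of y when some y values are missing.

    x        : (n,) or (n, d) covariates, observed for every unit
    y        : (n,) responses; entries where observed is False are ignored
    observed : (n,) boolean response indicator
    lam      : ridge tuning parameter (hypothetical default, not from the paper)
    """
    x = np.asarray(x, dtype=float)
    x = x.reshape(x.shape[0], -1)
    y = np.asarray(y, dtype=float)
    observed = np.asarray(observed, dtype=bool)

    x_r, y_r = x[observed], y[observed]
    n_r = y_r.shape[0]

    # Fit kernel ridge regression on the respondents only:
    # alpha = (K + n_r * lam * I)^{-1} y_r  (assumed scaling of the penalty).
    k_rr = gaussian_kernel(x_r, x_r, bandwidth)
    alpha = np.linalg.solve(k_rr + n_r * lam * np.eye(n_r), y_r)

    # Predict the regression function at every unit, respondents or not.
    m_hat = gaussian_kernel(x, x_r, bandwidth) @ alpha

    # Keep observed responses and impute the missing ones with m_hat.
    y_completed = np.where(observed, y, m_hat)
    return y_completed.mean()
```

In practice the tuning parameter and bandwidth would be chosen by a data-driven rule such as cross-validation; the fixed defaults here are placeholders only.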