The semiparametric estimation approach, which includes inverse-probability-weighted and doubly robust estimation using propensity scores, is a standard tool in causal inference, and it is rapidly being extended in various directions. On the other hand, although model selection is indispensable in statistical analysis, information criteria for selecting an appropriate regression structure have only begun to be developed. In this paper, based on the original definition of the Akaike information criterion (AIC; \citealt{Aka73}), we derive an AIC-type criterion for propensity score analysis. Here, we define a risk function based on the Kullback-Leibler divergence as the cornerstone of the information criterion and treat a general causal inference model that is not necessarily linear. The causal effects to be estimated are those in the general population, such as the average treatment effect on the treated or the average treatment effect on the untreated. Because this field attaches importance to doubly robust estimation, which allows either the model of the assignment variable or the model of the outcome variable to be wrong, we make the information criterion itself doubly robust: if either one of the two models is wrong, the criterion is still an asymptotically unbiased estimator of the risk function. In simulation studies, we compare the derived criterion with an existing criterion obtained from a formal argument and confirm that the former outperforms the latter. Specifically, we check that the divergence between the structure estimated with the derived criterion and the true structure is clearly smaller in all simulation settings and that the probability of selecting the true or nearly true model is clearly higher. Real data analyses confirm that the results of variable selection using the two criteria differ significantly.
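As a point of reference for the estimators named above, the following is a minimal sketch of inverse-probability-weighted (IPW) and doubly robust (augmented IPW) estimation on simulated data. It is not the criterion derived in this paper. The data-generating process, the true effect of 2, the choice of the average treatment effect as the estimand, and all function names (`fit_logistic`, `ipw_ate`, `aipw_ate`, `simulate`) are assumptions made purely for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, t, iters=25):
    """Fit a logistic regression by Newton-Raphson; returns the coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = sigmoid(X @ beta)
        grad = X.T @ (t - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def ipw_ate(y, t, e):
    """Normalized (Hajek-type) IPW estimate of the average treatment effect."""
    w1 = t / e
    w0 = (1.0 - t) / (1.0 - e)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)


def aipw_ate(y, t, e, m1, m0):
    """Augmented IPW (doubly robust) estimate of the average treatment effect."""
    mu1 = m1 + t * (y - m1) / e
    mu0 = m0 + (1.0 - t) * (y - m0) / (1.0 - e)
    return np.mean(mu1 - mu0)


def simulate(n=5000, seed=0):
    """Hypothetical data: one confounder x, true treatment effect equal to 2."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    t = rng.binomial(1, sigmoid(0.5 * x)).astype(float)
    y = 1.0 + 2.0 * t + x + rng.normal(size=n)
    return x, t, y


x, t, y = simulate()
X = np.column_stack([np.ones_like(x), x])

# Propensity score model (correctly specified in this simulation).
e_hat = sigmoid(X @ fit_logistic(X, t))

# Outcome regressions fitted separately within each treatment arm.
b1 = np.linalg.lstsq(X[t == 1], y[t == 1], rcond=None)[0]
b0 = np.linalg.lstsq(X[t == 0], y[t == 0], rcond=None)[0]
m1, m0 = X @ b1, X @ b0

ate_ipw = ipw_ate(y, t, e_hat)
ate_dr = aipw_ate(y, t, e_hat, m1, m0)

# Deliberately misspecified outcome model (arm means only, ignoring x):
# with a correct propensity model the doubly robust estimate stays consistent.
m1_bad = np.full_like(y, y[t == 1].mean())
m0_bad = np.full_like(y, y[t == 0].mean())
ate_dr_bad_outcome = aipw_ate(y, t, e_hat, m1_bad, m0_bad)

print(ate_ipw, ate_dr, ate_dr_bad_outcome)
```

In this sketch all three estimates should land near the simulated effect, and the last one illustrates the sense in which doubly robust estimation tolerates one wrong model. The criterion discussed in the abstract extends that same tolerance from the estimator to the model-selection step.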