Selective inference (post-selection inference) is a methodology that has attracted much attention in recent years in statistics and machine learning. Naive inference based on data that were also used for model selection tends to overstate significance, so selective inference conditions on the event that the model was selected. In this paper, we develop selective inference for propensity score analysis with a semiparametric approach, which has become a standard tool in causal inference. Specifically, for the most basic causal inference model, in which the causal effect can be written as a linear sum of confounding variables, we perform Lasso-type variable selection by adding an $\ell_1$ penalty term to the loss function that yields a semiparametric estimator. Confidence intervals are then given for the coefficients of the selected confounding variables, conditional on the event of variable selection, with asymptotic guarantees. An important property of this method is that it does not require modeling nonparametric regression functions for the outcome variables, as is usually required in semiparametric propensity score analysis.
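As a schematic illustration only (the notation below is ours and is not taken from the paper), the procedure described in the abstract can be pictured as follows: write the causal effect as a linear combination of confounders, $\tau(X) = X^\top \beta$, let $\hat{e}(X)$ denote an estimated propensity score, and let $L_n(\beta; \hat{e})$ denote the propensity-score-based loss whose minimizer is the semiparametric estimator. The Lasso-type selection step and the selective confidence interval then take the generic form
\[
\hat{\beta} \in \operatorname*{arg\,min}_{\beta} \; L_n(\beta; \hat{e}) + \lambda \|\beta\|_1,
\qquad
\hat{M} = \{\, j : \hat{\beta}_j \neq 0 \,\},
\]
\[
\Pr\!\left( \beta_j \in \mathrm{CI}_j \,\middle|\, \hat{M} = M \right) \to 1 - \alpha
\quad \text{for } j \in M,
\]
where the conditional coverage statement reflects the asymptotic guarantee mentioned in the abstract, and the interval $\mathrm{CI}_j$ would be constructed conditional on the selection event (e.g., via a truncated-distribution pivot, as in standard selective inference for the Lasso). The specific form of $L_n$ and of the conditioning argument are determined by the paper itself; this sketch only fixes the overall structure.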