We develop an information criterion for choosing the regularization parameters needed for variable selection when sparse estimation is combined with propensity score analysis. First, for causal inference models based on the Gaussian distribution, we extend Stein's unbiased risk estimation theory, which yields a generalized Cp criterion with almost no weaknesses in conventional sparse estimation, and derive a version of the criterion for inverse-probability-weighted sparse estimation without resorting to asymptotics. Next, for general causal inference models that are not necessarily Gaussian, we extend the asymptotic theory of the LASSO to propensity score analysis, with a view to implementing doubly robust sparse estimation. From this asymptotic theory we obtain an AIC-type information criterion for inverse-probability-weighted sparse estimation, and then derive a criterion that is itself doubly robust for doubly robust sparse estimation. Numerical experiments compare the proposed criterion with an existing criterion derived from a formal argument and show that the proposed criterion is superior in almost all cases, that the difference is not negligible in many cases, and that the variable selection results differ significantly. A real data analysis confirms that the two criteria lead to substantially different variable selection and estimates. Finally, we generalize the criterion to sparse estimation with the group LASSO, the elastic net, and non-convex regularization, indicating that it is highly extensible.
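To make the setting concrete, the following is a minimal sketch of inverse-probability-weighted LASSO estimation with the regularization parameter chosen by an information-criterion-style score. It is not the criterion derived in this paper. The score is a naive plug-in Cp-style quantity (weighted residual sum of squares over a variance estimate, plus twice the number of nonzero coefficients), used here only as an assumed stand-in for a criterion obtained by a formal argument. The simulated data, the use of the true propensity score as if it were known, and all function names are hypothetical. The code uses only the Python standard library so that it stays self-contained.

```python
import math
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in LASSO coordinate descent."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def weighted_lasso(X, y, w, lam, n_sweeps=100):
    """Minimise 0.5 * sum_i w_i (y_i - x_i'b)^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # current residual y - X beta (beta starts at zero)
    denom = [sum(w[i] * X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Weighted inner product of column j with the partial residual
            # (the residual with column j's current contribution added back).
            rho = sum(w[i] * X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n))
            new = soft_threshold(rho, lam) / denom[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def weighted_rss(X, y, w, beta):
    """Inverse-probability-weighted residual sum of squares."""
    return sum(
        wi * (yi - sum(xj * bj for xj, bj in zip(xi, beta))) ** 2
        for xi, yi, wi in zip(X, y, w)
    )


def naive_weighted_cp(X, y, w, beta, sigma2):
    """ASSUMED naive score: weighted RSS / sigma2 + 2 * (number of nonzero coefficients)."""
    k = sum(1 for b in beta if b != 0.0)
    return weighted_rss(X, y, w, beta) / sigma2 + 2 * k


def simulate(n=300, p=6, seed=0):
    """Hypothetical data: treatment depends on x; only treated outcomes are observed."""
    rng = random.Random(seed)
    true_beta = [2.0, -1.5] + [0.0] * (p - 2)
    X, y, w = [], [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(p)]
        e = 1 / (1 + math.exp(-(0.5 * x[0] - 0.5 * x[1])))  # true propensity (assumed known)
        t = 1 if rng.random() < e else 0
        y1 = sum(a * b for a, b in zip(x, true_beta)) + rng.gauss(0, 1)
        X.append(x)
        y.append(y1 if t else 0.0)  # untreated outcomes are unobserved and get weight 0
        w.append(t / e)             # inverse-probability weight
    return X, y, w, true_beta


def select_lambda(X, y, w, lambdas):
    """Pick the lambda in the grid that minimises the naive weighted Cp-style score."""
    full = weighted_lasso(X, y, w, 0.0)  # unpenalised weighted least squares fit
    sigma2 = weighted_rss(X, y, w, full) / (sum(w) - len(full))
    scored = [
        (naive_weighted_cp(X, y, w, weighted_lasso(X, y, w, lam), sigma2), lam)
        for lam in lambdas
    ]
    return min(scored)[1]
```

For example, `X, y, w, _ = simulate()` followed by `select_lambda(X, y, w, [1.0, 5.0, 20.0, 80.0])` returns the grid value with the smallest score. In practice the propensity score would be estimated rather than known, and this naive penalty does not account for the weighting, which is the gap the proposed criterion is intended to address.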