Negative control is a strategy for learning the causal relationship between treatment and outcome in the presence of unmeasured confounding. The treatment effect can nonetheless be identified if two auxiliary variables are available: a negative control treatment (which has no effect on the actual outcome), and a negative control outcome (which is not affected by the actual treatment). These auxiliary variables can also be viewed as proxies for a traditional set of control variables, and they bear resemblance to instrumental variables. I propose a family of algorithms based on kernel ridge regression for learning nonparametric treatment effects with negative controls. Examples include dose response curves, dose response curves with distribution shift, and heterogeneous treatment effects. Data may be discrete or continuous, and low, high, or infinite dimensional. I prove uniform consistency and provide finite sample rates of convergence. I estimate the dose response curve of cigarette smoking on infant birth weight adjusting for unobserved confounding due to household income, using a data set of singleton births in the state of Pennsylvania between 1989 and 1991.
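As background for the building block named in the abstract, the following is a minimal sketch of ordinary kernel ridge regression with a Gaussian kernel. It is not the negative control estimator proposed here, which the abstract does not spell out. It only illustrates the closed-form solve that kernel ridge regression relies on. The function names (`gaussian_kernel`, `krr_fit`, `krr_predict`), the simulated data, and the hyperparameter values are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(A, B, lengthscale=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * lengthscale**2))


def krr_fit(X, y, lam=1e-3, lengthscale=1.0):
    """Solve (K + n * lam * I) alpha = y for the dual coefficients alpha."""
    n = X.shape[0]
    K = gaussian_kernel(X, X, lengthscale)
    return np.linalg.solve(K + n * lam * np.eye(n), y)


def krr_predict(X_train, alpha, X_new, lengthscale=1.0):
    """Evaluate the fitted function at the rows of X_new."""
    return gaussian_kernel(X_new, X_train, lengthscale) @ alpha


# Simulated example (illustrative only): a noisy sine curve.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)

alpha = krr_fit(X, y, lam=1e-3, lengthscale=0.5)
X_grid = np.linspace(-2.5, 2.5, 50)[:, None]
y_hat = krr_predict(X, alpha, X_grid, lengthscale=0.5)

# Largest absolute gap between the fit and the noiseless curve on the grid.
max_err = float(np.max(np.abs(y_hat - np.sin(X_grid[:, 0]))))
```

The regularization is scaled by the sample size `n`, a common convention in the kernel ridge regression literature. Other scalings are equivalent up to a redefinition of `lam`.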