We consider the problem of learning treatment (or policy) rules that are externally valid in the sense that they have welfare guarantees in target populations that are similar to, but possibly different from, the experimental population. We allow for shifts in both the distribution of potential outcomes and covariates between the experimental and target populations. This paper makes two main contributions. First, we provide a formal sense in which policies that maximize social welfare in the experimental population remain optimal for the "worst-case" social welfare when the distribution of potential outcomes (but not covariates) shifts. Hence, policy learning methods that have good regret guarantees in the experimental population, such as empirical welfare maximization, are externally valid with respect to a class of shifts in potential outcomes. Second, we develop methods for policy learning that are robust to shifts in the joint distribution of potential outcomes and covariates. Our methods may be used with experimental or observational data.
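To make the reference to empirical welfare maximization concrete, the following is a minimal sketch of that idea in the experimental population only; it does not implement the robust methods the paper develops. It assumes a randomized experiment with a known, constant treatment probability, uses the standard inverse-propensity-weighted welfare estimate, and searches over a class of one-dimensional threshold rules. The function names, the threshold policy class, and the simulated data are all illustrative assumptions, not taken from the paper.

```python
import random


def empirical_welfare(policy, data, propensity=0.5):
    """Inverse-propensity-weighted estimate of the mean outcome under `policy`.

    `data` is a list of (x, d, y) triples: covariate, treatment indicator
    (0 or 1), and observed outcome. `propensity` is the known probability
    of treatment in the experiment (assumed constant here).
    """
    total = 0.0
    for x, d, y in data:
        treat = policy(x)
        if treat and d == 1:
            # Unit was treated and the policy would treat it.
            total += y / propensity
        elif not treat and d == 0:
            # Unit was untreated and the policy would not treat it.
            total += y / (1.0 - propensity)
    return total / len(data)


def fit_threshold_policy(data, propensity=0.5):
    """Pick the rule 'treat iff x >= t' with the highest empirical welfare.

    Candidate thresholds are the observed covariate values, plus +inf,
    which corresponds to treating no one.
    """
    candidates = sorted({x for x, _, _ in data}) + [float("inf")]
    best_t, best_w = None, float("-inf")
    for t in candidates:
        w = empirical_welfare(lambda x, t=t: x >= t, data, propensity)
        if w > best_w:
            best_t, best_w = t, w
    return best_t, best_w


def simulate(n=1000, seed=0):
    """Hypothetical experiment: the treatment effect equals x, so treating
    is beneficial exactly when x > 0."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        d = 1 if rng.random() < 0.5 else 0
        y = d * x + rng.gauss(0.0, 0.1)
        data.append((x, d, y))
    return data
```

Calling `fit_threshold_policy(simulate())` returns the estimated threshold and its empirical welfare. In this simulated design the welfare-maximizing threshold is 0, so the estimate should land near it. The guarantee this sketch illustrates concerns only the experimental population; the external validity results summarized above are about how such a rule performs when the target distribution shifts.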