We consider the extent to which we can learn from a completely randomized experiment whether everyone has treatment effects that are weakly of the same sign, a condition we call monotonicity. From a classical sampling perspective, it is well known that monotonicity is untestable. By contrast, we show that, from the design-based perspective (in which the units in the population are fixed and only treatment assignment is stochastic), the distribution of treatment effects in the finite population, and hence whether monotonicity holds, is formally identified. We argue, however, that the usual definition of identification is unnatural in the design-based setting because it imagines knowing the distribution of outcomes over different treatment assignments for the same units. We thus evaluate the informativeness of the data by the extent to which it enables frequentist testing and Bayesian updating. We show that frequentist tests can have nontrivial power against some alternatives, but power is generically limited. Likewise, we show that there exist (non-degenerate) Bayesian priors that never update about whether monotonicity holds. We conclude that, despite the formal identification result, the ability to learn about monotonicity from data in practice is severely limited.
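As a toy illustration of the gap between formal identification and practical learning, the following sketch enumerates every assignment in a completely randomized experiment on four units with binary outcomes. The two populations, the helper `randomization_distribution`, and the summary statistic (the number of ones among treated and among control units) are assumptions chosen for illustration and are not taken from the paper. Both populations share the same marginal distributions of potential outcomes, but one has all treatment effects equal to zero (monotone) and the other has effects of both signs.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def randomization_distribution(units, n_treated):
    """Exact distribution of (ones among treated, ones among control).

    `units` is a list of (y0, y1) potential-outcome pairs for a fixed
    finite population; exactly `n_treated` units are treated, with all
    such assignments equally likely (complete randomization).
    """
    n = len(units)
    counts = Counter()
    for treated in combinations(range(n), n_treated):
        treated = set(treated)
        # Treated units reveal y1, control units reveal y0.
        t_sum = sum(units[i][1] for i in treated)
        c_sum = sum(units[i][0] for i in range(n) if i not in treated)
        counts[(t_sum, c_sum)] += 1
    total = sum(counts.values())
    return {key: Fraction(value, total) for key, value in counts.items()}


# Hypothetical populations with identical marginals of y0 and y1.
monotone = [(0, 0), (0, 0), (1, 1), (1, 1)]      # every effect is 0
non_monotone = [(0, 1), (0, 1), (1, 0), (1, 0)]  # effects +1 and -1

dist_a = randomization_distribution(monotone, 2)
dist_b = randomization_distribution(non_monotone, 2)

print(dist_a)
print(dist_b)
```

In this toy case the two randomization distributions differ, so someone who knew the full distribution over assignments could tell the populations apart. A single realized experiment, however, produces the same data, one success in each arm, with probability 2/3 under either population, which gives a sense of why power can be limited.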