Ever since the seminal work of R. A. Fisher and F. Yates, factorial designs have been an important experimental tool to simultaneously estimate the treatment effects of multiple factors. In factorial designs, the number of treatment levels may grow exponentially with the number of factors, which motivates the forward screening strategy based on the sparsity, hierarchy, and heredity principles for factorial effects. Although this strategy is intuitive and has been widely used in practice, its rigorous statistical theory has not been formally established. To fill this gap, we establish design-based theory for forward factor screening in factorial designs based on the potential outcome framework. We not only prove its consistency property but also discuss statistical inference after factor screening. In particular, with perfect screening, we quantify the advantages of forward screening based on asymptotic efficiency gain in estimating factorial effects. With imperfect screening in higher-order interactions, we propose two novel strategies and investigate their impact on subsequent inference. Our formulation differs from the existing literature on variable selection and post-selection inference because our theory is based solely on the physical randomization of the factorial design and does not rely on a correctly specified outcome model.
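To make the forward screening idea concrete, the following is a minimal sketch for a 2^K factorial design, not the procedure analyzed in the paper. It assumes noiseless cell means, a fixed magnitude threshold in place of a formal test, strong heredity (an interaction is a candidate only if all of its lower-order parents were selected), and one common normalization of factorial effects (contrast divided by 2^(K-1)). The function names `factorial_effect` and `forward_screen` and the toy outcome surface are hypothetical.

```python
from itertools import combinations, product


def factorial_effect(cell_means, subset, K):
    """Contrast of the 2^K cell means for the factors in `subset`.

    Assumed normalization: signed sum divided by 2^(K-1).
    """
    total = 0.0
    for z in product((0, 1), repeat=K):
        sign = 1
        for k in subset:
            sign *= 1 if z[k] == 1 else -1
        total += sign * cell_means[z]
    return total / 2 ** (K - 1)


def forward_screen(cell_means, K, max_order, threshold):
    """Select effects order by order (hierarchy) under strong heredity.

    The threshold rule is an illustrative stand-in for a statistical test.
    """
    selected = []
    previous = set()
    for order in range(1, max_order + 1):
        if order == 1:
            candidates = [(k,) for k in range(K)]
        else:
            # Strong heredity: every parent of one order lower must be selected.
            candidates = [
                s for s in combinations(range(K), order)
                if all(p in previous for p in combinations(s, order - 1))
            ]
        kept = [
            s for s in candidates
            if abs(factorial_effect(cell_means, s, K)) > threshold
        ]
        if not kept:
            break  # sparsity: stop once an entire order is screened out
        selected.extend(kept)
        previous = set(kept)
    return selected


# Toy example (assumed): three factors, only factors 0 and 1 matter.
K = 3
cell_means = {
    z: 1.0 + 2.0 * z[0] + 1.0 * z[1] + 1.5 * z[0] * z[1]
    for z in product((0, 1), repeat=K)
}
selected = forward_screen(cell_means, K, max_order=3, threshold=0.5)
print(selected)
```

In this toy setting, factor 2 is dropped at the first step, so no interaction involving it is ever examined. This is how forward screening avoids estimating all 2^K - 1 effects.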