Factorial designs are widely used because they accommodate multiple factors simultaneously. Factor-based regression with main effects and some interactions is the dominant strategy for downstream data analysis, delivering point estimators and standard errors from a single regression. Justifying these convenient estimators from the design-based perspective requires quantifying their sampling properties under the assignment mechanism, conditional on the potential outcomes. To this end, we derive the sampling properties of the factor-based regression estimators from both saturated and unsaturated models, and demonstrate the appropriateness of the robust standard errors for Wald-type inference. We then quantify the bias-variance trade-off between the saturated and unsaturated models from the design-based perspective, and establish a novel design-based Gauss--Markov theorem that ensures the efficiency gain of the unsaturated model when the omitted nuisance effects are indeed absent. As a byproduct, we unify the definitions of factorial effects across the various literatures and propose a location-shift strategy for their direct estimation from factor-based regressions. Our theory and simulation suggest using factor-based inference for general factorial effects, preferably with parsimonious specifications that reflect prior knowledge of zero nuisance effects.
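To make the saturated versus unsaturated comparison concrete, the following is a minimal illustrative sketch, not the paper's own implementation. It assumes a 2^2 design with factors coded as -1/+1, hypothetical fixed potential outcomes, unequal cell sizes, and the HC2 variant of the heteroskedasticity-robust covariance; the function name `ols_robust` and all numerical values are invented for illustration.

```python
import numpy as np

def ols_robust(X, y):
    """OLS coefficients with HC2 robust standard errors (illustrative choice)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta
    # Leverage h_ii = x_i' (X'X)^{-1} x_i for each unit.
    leverage = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    weights = resid**2 / (1.0 - leverage)
    meat = X.T @ (X * weights[:, None])
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)

# The four treatment combinations of a 2^2 design, coded -1/+1 (an assumption).
levels = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
cell_sizes = np.array([20, 40, 30, 60])  # hypothetical, deliberately unbalanced
n = cell_sizes.sum()

# Fixed potential outcomes, one column per treatment combination.
# Hypothetical values: main effects of A and B, no A-by-B interaction.
unit_effect = rng.normal(size=n)
potential = (
    1.0
    + unit_effect[:, None]
    + 0.5 * levels[:, 0]
    + 0.25 * levels[:, 1]
    + 0.3 * unit_effect[:, None] * levels[:, 0]  # unit-level heterogeneity
)

# Complete randomization: the assignment is the only source of randomness.
cell = rng.permutation(np.repeat(np.arange(4), cell_sizes))
y = potential[np.arange(n), cell]
A, B = levels[cell, 0], levels[cell, 1]

# Saturated model: intercept, both main effects, and the interaction.
X_sat = np.column_stack([np.ones(n), A, B, A * B])
# Unsaturated model: drops the interaction, treated here as a nuisance effect.
X_unsat = X_sat[:, :3]

beta_sat, se_sat = ols_robust(X_sat, y)
beta_unsat, se_unsat = ols_robust(X_unsat, y)

print("saturated:  ", np.round(beta_sat, 3), np.round(se_sat, 3))
print("unsaturated:", np.round(beta_unsat, 3), np.round(se_unsat, 3))
```

In this sketch the saturated fit reproduces the four observed cell means exactly, while the unsaturated fit pools information across cells by omitting the interaction term. Repeating the randomization step many times with `potential` held fixed would give a Monte Carlo view of the design-based sampling distribution of either estimator.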