Two-phase designs measure the variables of interest on a subcohort, while the outcome and covariates that are readily available or cheap to collect are recorded for all individuals in the cohort. Given limited resources, it is of interest to find an optimal design that includes more informative individuals in the final sample. We explore optimal designs and their efficiencies for analysis by design-based estimators. Generalized raking is an efficient design-based estimator that improves on the inverse-probability weighted (IPW) estimator by adjusting the weights using the auxiliary information. We derive a closed-form solution for the optimal design for estimating regression coefficients with generalized raking estimators. We compare it with the optimal design for analysis via the IPW estimator and with other two-phase designs in measurement-error settings. We consider general two-phase designs in which the outcome variable and the variables of interest can be continuous or discrete. Our results show that the optimal designs for analysis by the two design-based estimators can be very different. The optimal design for IPW estimation is, by construction, optimal for analysis via the IPW estimator, and it typically gives near-optimal efficiency for generalized raking, though we show there is potential for improvement in some settings.
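To make the weight adjustment concrete, the following is a minimal sketch of how auxiliary information can be used to adjust IPW weights. It uses linear calibration, one member of the generalized raking family, and targets a population total rather than the regression coefficients studied here. The simulated cohort, the sampling probabilities, and the function names `ipw_total` and `linear_calibration_weights` are illustrative assumptions, not the authors' implementation or design.

```python
import numpy as np


def ipw_total(y, pi):
    """IPW (Horvitz-Thompson) estimate of a population total."""
    return float(np.sum(y / pi))


def linear_calibration_weights(x, pi, totals):
    """Adjust design weights d = 1/pi so that the weighted phase-2 totals
    of the auxiliary variables x match their known cohort totals.
    Linear calibration only; other raking distances need iteration."""
    d = 1.0 / pi
    m = (x * d[:, None]).T @ x                  # sum_i d_i x_i x_i^T
    lam = np.linalg.solve(m, totals - d @ x)    # calibration multipliers
    return d * (1.0 + x @ lam)


# Hypothetical cohort: `aux` is observed for everyone (phase 1),
# while y is only used for the phase-2 subsample.
rng = np.random.default_rng(0)
N = 5000
aux = rng.normal(size=N)
y = 2.0 + 1.5 * aux + rng.normal(scale=0.5, size=N)

# Assumed design: oversample individuals with extreme auxiliary values.
pi = np.where(np.abs(aux) > 1.0, 0.4, 0.1)
sampled = rng.random(N) < pi

X = np.column_stack([np.ones(N), aux])          # intercept + auxiliary
totals = X.sum(axis=0)                          # known cohort totals

w = linear_calibration_weights(X[sampled], pi[sampled], totals)
ipw_est = ipw_total(y[sampled], pi[sampled])
cal_est = float(np.sum(w * y[sampled]))
print(f"true total {y.sum():.1f}, IPW {ipw_est:.1f}, calibrated {cal_est:.1f}")
```

By construction the calibrated weights reproduce the cohort totals of the auxiliary variables exactly, which is the source of the efficiency gain when the auxiliary variables are correlated with the quantity being estimated. For regression coefficients, the auxiliary variables would instead be built from phase-1 data in a way suited to the target parameter, which this sketch does not attempt.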