The Causal Roadmap outlines a systematic approach to asking and answering questions of cause and effect: define the quantity of interest, evaluate the needed assumptions, conduct statistical estimation, and carefully interpret the results. It is paramount that the algorithm for statistical estimation and inference be carefully pre-specified to optimize its expected performance for the specific real-data application. Simulations that realistically reflect the application, including key characteristics such as strong confounding and dependent or missing outcomes, can help us better understand an estimator's applied performance. We illustrate this with two examples, using the Causal Roadmap and realistic simulations to inform estimator selection and full specification of the Statistical Analysis Plan. First, in an observational longitudinal study, outcome-blind simulations are used to inform nuisance parameter estimation and variance estimation for longitudinal targeted maximum likelihood estimation (TMLE). Second, in a cluster-randomized controlled trial with missing outcomes, treatment-blind simulations are used to ensure Type-I error control in Two-Stage TMLE. In both examples, realistic simulations empower us to pre-specify an estimator that is expected to have strong finite-sample performance, and they also yield quality-controlled computing code for the actual analysis. Together, this process helps to improve the rigor and reproducibility of our research.
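As a rough illustration of how a simulation can reveal an estimator's performance under strong confounding, consider the minimal sketch below. It is not the TMLE or Two-Stage TMLE used in the examples: it assumes a hypothetical data-generating process with a single confounder `w` and a known true effect, and it uses an ordinary least-squares adjustment as a simple stand-in for the adjusted estimator. All function names, sample sizes, and coefficients are illustrative assumptions.

```python
import numpy as np


def simulate_once(rng, n=500, true_effect=1.0):
    """Draw one hypothetical dataset and return two estimates plus CI coverage."""
    w = rng.normal(size=n)                      # single measured confounder (assumed)
    p = 1.0 / (1.0 + np.exp(-2.0 * w))          # treatment depends strongly on w
    a = rng.binomial(1, p)                      # binary treatment
    y = true_effect * a + 2.0 * w + rng.normal(size=n)

    # Unadjusted estimator: difference in mean outcomes by treatment arm.
    unadjusted = y[a == 1].mean() - y[a == 0].mean()

    # Adjusted estimator: OLS of y on intercept, treatment, and confounder.
    X = np.column_stack([np.ones(n), a, w])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])

    # Does the Wald-type 95% interval contain the true effect?
    covered = abs(beta[1] - true_effect) <= 1.96 * se
    return unadjusted, beta[1], covered


def run_simulation(n_reps=500, seed=0, true_effect=1.0):
    """Summarize bias and confidence interval coverage across repetitions."""
    rng = np.random.default_rng(seed)
    results = np.array(
        [simulate_once(rng, true_effect=true_effect) for _ in range(n_reps)],
        dtype=float,
    )
    return {
        "bias_unadjusted": results[:, 0].mean() - true_effect,
        "bias_adjusted": results[:, 1].mean() - true_effect,
        "coverage_adjusted": results[:, 2].mean(),
    }


if __name__ == "__main__":
    for name, value in run_simulation().items():
        print(f"{name}: {value:.3f}")
```

In a pre-specification setting, the same loop would be repeated for each candidate estimator and variance estimator, with the data-generating process built from the real data (outcome-blind or treatment-blind) rather than from the toy model assumed here.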