Understanding sample size, statistical power, and the accuracy and precision of an estimator in epidemiological research can facilitate power and bias analyses. However, such understanding can become complicated for several reasons. First, exposures that vary spatiotemporally may be heteroskedastic. Second, distributed lags of exposures may be used to identify critical exposure time-windows. Third, exposure measurement error may exist, affecting the accuracy and/or precision of the estimator and, consequently, sample size and statistical power. Fourth, research may rely on different study designs, so this understanding may differ by design. For example, case-crossover designs, a form of matched case-control design, are used to estimate health effects of short-term exposures. To address these gaps, I developed approximation equations for sample size, for estimates of the estimators, and for their standard errors, including polynomials for non-linear effect estimation. Using air pollution exposure estimates, I examined the approximations in statistical simulations. Overall, sample size and the accuracy and precision of the estimators can be approximated from external information about validation, without validation data in hand. For distributed lags, the approximations may perform well if residual confounding due to covariate measurement errors is not severe. This condition may be difficult to identify without validation data, so validation research is recommended when identifying critical exposure time-windows.