Big data is ubiquitous in practice, but it also imposes a heavy computational burden. To reduce the computational cost while preserving the effectiveness of parameter estimators, an optimal subsampling method is proposed for estimating the parameters of marginal models with massive longitudinal data. The optimal subsampling probabilities are derived, and the consistency and asymptotic normality of the resulting estimator are established. Extensive simulation studies evaluate the performance of the proposed method for continuous, binary, and count data under four different working correlation matrices. A depression dataset is used to illustrate the proposed method.