In recent years, mediation analysis has become increasingly popular in many research fields. The aim of mediation analysis is to investigate the direct effect of an exposure on an outcome together with the indirect effects along the pathways from exposure to outcome. A great number of articles have applied mediation analysis to data from hundreds or thousands of individuals. With the rapid development of technology, the volume of available data is increasing exponentially, which brings new challenges to researchers: it is often computationally infeasible to conduct statistical analysis directly on large datasets. However, there are very few results on mediation analysis with massive data. In this paper, we propose to use the subsampled double bootstrap as well as a divide-and-conquer algorithm to perform statistical mediation analysis for large-scale datasets. Extensive numerical simulations are conducted to evaluate the performance of our method. Two real data examples are also provided to illustrate the usefulness of our approach in practical applications.
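To make the divide-and-conquer idea concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a single-mediator linear model (mediator regressed on exposure, outcome regressed on exposure and mediator), estimates the indirect effect as the product of coefficients a*b on disjoint blocks of the data, and averages the block estimates. The function names, the simulated data, and the choice of 20 blocks are hypothetical and chosen only for illustration.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Product-of-coefficients estimate a*b for an assumed single-mediator linear model."""
    ones = np.ones_like(x)
    # Mediator model: m = i1 + a*x + e1; keep the slope on x
    a = np.linalg.lstsq(np.column_stack([ones, x]), m, rcond=None)[0][1]
    # Outcome model: y = i2 + c*x + b*m + e2; keep the slope on m
    b = np.linalg.lstsq(np.column_stack([ones, x, m]), y, rcond=None)[0][2]
    return a * b

def divide_and_conquer(x, m, y, n_blocks):
    """Average the block-wise estimates over disjoint, contiguous blocks."""
    # Rows are assumed to be in random order; shuffle first if they are not.
    blocks = np.array_split(np.arange(len(x)), n_blocks)
    return float(np.mean([indirect_effect(x[i], m[i], y[i]) for i in blocks]))

# Simulated data (illustrative only): true a = 0.5, b = 0.8, so a*b = 0.4
rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.3 * x + 0.8 * m + rng.normal(size=n)

dc_estimate = divide_and_conquer(x, m, y, n_blocks=20)
full_estimate = indirect_effect(x, m, y)
```

Each block only requires fitting two small regressions, so the blocks can be processed independently or in parallel, and the full dataset never has to be held by a single fitting routine. Quantifying the uncertainty of the estimate, which is the role of the subsampled double bootstrap in the paper, is not covered by this sketch.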