Integrating data from multiple heterogeneous sources has become increasingly popular as a way to achieve a large sample size and a diverse study population. This paper reviews developments in causal inference methods that combine multiple datasets collected under potentially different designs from potentially heterogeneous populations. We summarize recent advances in combining randomized clinical trials with external information from observational studies or historical controls; combining samples when no single sample has all relevant variables, with application to two-sample Mendelian randomization; distributed data settings under privacy concerns for comparative effectiveness and safety research using real-world data; Bayesian causal inference; and causal discovery methods.