Federated learning is an appealing framework for analyzing sensitive data from distributed health data networks because it protects data privacy. Under this framework, data partners at local sites collaboratively build an analytical model under the orchestration of a coordinating site, while keeping the data decentralized. However, existing federated learning methods mainly assume that data across sites are homogeneous samples of the global population, and hence fail to properly account for the extra variability across sites in estimation and inference. Drawing on a multi-hospital electronic health records network, we develop an efficient and interpretable tree-based ensemble of personalized treatment effect estimators that combines results across hospital sites while explicitly modeling the heterogeneity in data sources through site partitioning. We demonstrate the efficiency of our method in a study of the causal effects of oxygen saturation on hospital mortality and support it with comprehensive numerical results.