Causal inference on distributed data under privacy constraints has attracted considerable attention in recent years. We propose a quasi-experiment based on data collaboration (DC-QE) that enables causal inference from distributed data while preserving privacy. The method protects private data by sharing only dimensionality-reduced intermediate representations, which each party constructs individually. Moreover, it can reduce both random errors and biases in the estimation of treatment effects, whereas existing methods can reduce only random errors. In numerical experiments on both artificial and real-world data, our method produced better estimation results than individual analyses. If the method becomes widely adopted, intermediate representations could be published as open data, helping researchers identify causal relationships, and accumulated as a knowledge base.
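The sharing step described in the abstract, where each party builds its own dimensionality-reduced intermediate representation and shares only that, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the DC-QE procedure itself: the choice of PCA (via SVD) as the reduction function, the function name `make_intermediate_representation`, the party sizes, and the synthetic data are all hypothetical and do not come from the abstract. The later steps of the method (combining the representations and estimating treatment effects) are not shown.

```python
import numpy as np


def make_intermediate_representation(X, n_components):
    """Project local covariates X (n_samples x n_features) onto a
    low-dimensional subspace chosen by the party itself.

    Assumption: PCA via SVD stands in for the party's reduction function.
    """
    Xc = X - X.mean(axis=0)  # center the local data
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    projection = vt[:n_components].T  # stays private to the party
    return Xc @ projection, projection


rng = np.random.default_rng(0)

# Two hypothetical parties holding the same 10 covariates for different samples.
party_data = [rng.normal(size=(50, 10)), rng.normal(size=(80, 10))]

shared = []
for X in party_data:
    Z, _ = make_intermediate_representation(X, n_components=3)
    shared.append(Z)  # only the reduced representation Z leaves the party

for Z in shared:
    print(Z.shape)
```

In this sketch neither the raw covariates nor the projection matrix leaves a party; only the reduced matrix `Z` is shared, which mirrors the privacy argument made in the abstract.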