Replication analysis is widely used in many fields of study. Once a study is published, other researchers often conduct the same or a very similar analysis to confirm the reliability of the published findings. However, what if the data is confidential? If the data sets used in the studies are confidential, the results of replication analyses cannot be released to any entity that lacks permission to access those data sets; otherwise, serious privacy leakage may result, especially when the published study and the replication studies use similar or common data sets. For example, examining the influence of the treatment on outliers can leak substantial information about those outliers. In this paper, we build two frameworks for replication analysis using a differentially private Bayesian approach. We formalize our questions of interest and illustrate the properties of our methods through a combination of theoretical analysis and simulation, showing the feasibility of our approach. We also provide guidance on the choice of parameters and the interpretation of the results.