Differential privacy provides a strong form of privacy while preserving most of the original characteristics of the dataset. Realizing these benefits requires designing dedicated differentially private data analysis algorithms. In this work, we present three tree-based algorithms for mining redescriptions while preserving differential privacy. Redescription mining is an exploratory data analysis method for finding connections between two views over the same entities, such as the phenotypes and genotypes of medical patients. It has applications in many fields, including some, like health care informatics, where privacy-preserving access to data is desired. Our algorithms are the first differentially private redescription mining algorithms, and we show via experiments that, despite the inherent noise in differential privacy, they can return trustworthy results even on smaller datasets, where noise typically has a stronger effect.