Entity resolution is the task of identifying records that refer to the same real-world entity. In this work, we explore adapting PPJoin, one of the most efficient and accurate Jaccard-based entity resolution algorithms, to the private domain via homomorphic encryption. To this end, we present a precise adaptation of PPJoin (HE-PPJoin) and detail the subtle data structure modifications and algorithmic additions needed for correctness and privacy. We implement HE-PPJoin by extending the PALISADE homomorphic encryption library and evaluate its accuracy and incurred overhead. Furthermore, we directly compare HE-PPJoin against P4Join, an existing privacy-preserving variant of PPJoin that uses fingerprinting to obfuscate raw content. The comparison provides a rigorous analysis of the efficiency, accuracy, and privacy properties achieved by our adaptation, together with a characterization of those same properties in P4Join.
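For readers unfamiliar with the Jaccard-based join that PPJoin builds on, the following is a minimal plaintext sketch of a set-similarity self-join with prefix filtering. It is an illustration under stated assumptions, not the HE-PPJoin or P4Join procedure: it involves no encryption, it omits PPJoin's positional filtering, and the function names (`jaccard`, `prefix_length`, `similarity_join`) are hypothetical. Tokens are ordered lexicographically for simplicity; any fixed global ordering keeps the filter correct.

```python
import math


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two token collections."""
    a, b = set(a), set(b)
    union = a | b
    # Assumption of this sketch: two empty records never count as a match.
    return len(a & b) / len(union) if union else 0.0


def prefix_length(size, threshold):
    """Number of leading tokens that must be inspected for a record of `size`.

    If two records have Jaccard similarity >= threshold, their prefixes of
    this length (under one shared token ordering) share at least one token.
    """
    return size - math.ceil(threshold * size) + 1


def similarity_join(records, threshold):
    """Return sorted index pairs (i, j), i < j, with Jaccard >= threshold."""
    # Deduplicate and order tokens with one global ordering (plain sort here).
    ordered = [sorted(set(r)) for r in records]
    index = {}  # token -> ids of earlier records holding it in their prefix
    results = []
    for i, rec in enumerate(ordered):
        candidates = set()
        for tok in rec[:prefix_length(len(rec), threshold)]:
            candidates.update(index.get(tok, ()))
            index.setdefault(tok, []).append(i)
        # Verify each surviving candidate with the exact similarity.
        for j in candidates:
            if jaccard(rec, ordered[j]) >= threshold:
                results.append((j, i))
    return sorted(results)


records = [
    ["a", "b", "c", "d"],
    ["a", "b", "c", "e"],
    ["x", "y", "z"],
    ["a", "b", "c", "d"],
]
pairs = similarity_join(records, 0.5)
```

With these four toy records and a threshold of 0.5, the third record shares no prefix token with the others and is never compared against them, which is the pruning effect the prefix filter is meant to provide. Because `threshold * size` is computed in floating point, thresholds that are not exactly representable may need an exact-arithmetic variant of `prefix_length`.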