An Evaluation of Open-source Tools for the Provision of Differential Privacy

Differential privacy has been widely adopted in academia and industry because its formal guarantee on individual privacy supports compliance with privacy legislation such as GDPR. However, the tools capable of achieving differential privacy are poorly understood, and it is not clear what to expect from existing differential privacy tools when implementing privacy protection. This obstacle limits the further growth of private applications. This paper reviews and evaluates state-of-the-art open-source differential privacy tools from different domains using various evaluation criteria and privacy settings. In particular, we examine the performance of three differential privacy tools for machine learning, two for statistical queries, and four for synthetic data generation. We test all the tools on both continuous and categorical data and quantify their performance under different privacy budgets and data sizes with respect to utility loss and system overhead. The accumulated evaluation results reveal several patterns that users can follow to configure the tools optimally, and provide preliminary guidelines on tool selection under different criteria. Finally, we openly release our evaluation code repository, a framework that users can reuse to further evaluate the studied tools and others. We anticipate that this work will provide comprehensive insight into the performance of the dominant existing privacy tools and a concrete reference for a potentially large community of developers of private applications, thus narrowing the gap between conceptual differential privacy and private functionality development.

