Analyzing and mining graph data, such as social networks, communication networks, and citation networks, is important in a wide spectrum of real-world applications. However, releasing such graph data often raises privacy issues, and graph privacy preservation has recently drawn much attention from the database community. Prior work on graph privacy preservation has mainly focused on protecting either the graph structure alone or the vertex attributes alone. In this paper, we propose a novel mechanism for graph privacy preservation that considers attacks exploiting both graph structure and vertex attributes. The mechanism transforms the original graph into a so-called kt-safe graph via k-anonymity and t-closeness. We prove that generating a kt-safe graph is NP-hard; therefore, we propose a feasible framework for effectively and efficiently anonymizing a graph with low anonymization cost. In particular, we design a cost-model-based graph partitioning approach that enables our proposed divide-and-conquer strategy for graph anonymization, and we propose effective optimization techniques, such as pruning methods and a tree synopsis, to improve anonymization efficiency over large-scale graphs. Extensive experiments on both real and synthetic data sets verify the efficiency and effectiveness of our proposed kt-safe graph generation approach.
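To make the two ingredients named above concrete, the following is a minimal sketch of how a combined k-anonymity and t-closeness check on a graph might look. It is not the paper's definition of a kt-safe graph, which the abstract does not spell out. It assumes, purely for illustration, that vertices are grouped by degree (a simple structural signature), that each vertex carries one categorical attribute, and that closeness is measured by the variational distance between a group's attribute distribution and the distribution over the whole graph. The function names `degree_groups`, `distribution`, `variational_distance`, and `is_kt_safe_sketch` are hypothetical.

```python
from collections import Counter, defaultdict


def degree_groups(adj):
    """Group vertices by degree (assumed stand-in for a structural signature)."""
    groups = defaultdict(list)
    for v, nbrs in adj.items():
        groups[len(nbrs)].append(v)
    return groups


def distribution(values):
    """Empirical distribution of a list of categorical values."""
    counts = Counter(values)
    n = len(values)
    return {a: c / n for a, c in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two categorical distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in keys)


def is_kt_safe_sketch(adj, attr, k, t):
    """Illustrative check only: every degree group has at least k vertices
    (k-anonymity) and its attribute distribution lies within distance t of
    the global attribute distribution (t-closeness)."""
    global_dist = distribution([attr[v] for v in adj])
    for group in degree_groups(adj).values():
        if len(group) < k:
            return False
        group_dist = distribution([attr[v] for v in group])
        if variational_distance(group_dist, global_dist) > t:
            return False
    return True


# Hypothetical toy input: a path 0-1-2-3 with one categorical attribute.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
labels = {0: "a", 1: "b", 2: "b", 3: "a"}
print(is_kt_safe_sketch(path, labels, k=2, t=0.4))
```

In this toy input both degree groups have two vertices, so the k = 2 condition holds, but each group carries a single attribute value while the graph as a whole is evenly split. The check therefore fails for a small t, which illustrates why structural anonymity alone does not protect vertex attributes.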