Inference centers need more data to build more comprehensive and useful learning models, so they must collect data from data providers. Data providers, however, are reluctant to hand over their datasets because of privacy concerns. In this paper, we modify the structure of the autoencoder to obtain a method that manages the utility-privacy trade-off well. Specifically, the encoder first compresses the data; a classifier then separates the confidential and non-confidential features and decorrelates them. The confidential feature is appropriately combined with noise, the non-confidential feature is enhanced, and the decoder finally produces data in the original data format. The proposed architecture also lets data providers set the level of privacy required for the confidential features. The method is evaluated on both image and categorical datasets, and the results show a significant performance improvement over previous methods.
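The data flow described in the abstract (encode, split the latent features, perturb the confidential part, enhance the non-confidential part, decode) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the encoder and decoder are untrained random linear maps, the confidential features are assumed to be the first `n_conf` latent coordinates, the noise is additive Gaussian with standard deviation `privacy_level`, and "enhancement" is a plain scalar `gain`. The classifier-driven separation and decorrelation step is not modelled.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Untrained stand-in for learned encoder/decoder weights (assumption)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def sanitize(x, enc, dec, n_conf, privacy_level, rng, gain=1.0):
    """Hypothetical pipeline: encode, split, add noise, enhance, decode."""
    z = matvec(enc, x)  # compress the input into a latent code
    # Assumed split: the first n_conf latent coordinates are confidential.
    conf, public = z[:n_conf], z[n_conf:]
    # Provider-chosen privacy level, modelled here as the noise std. dev.
    conf = [c + rng.gauss(0.0, privacy_level) for c in conf]
    # Placeholder for the enhancement of non-confidential features.
    public = [gain * p for p in public]
    return matvec(dec, conf + public)  # back to the original data format


if __name__ == "__main__":
    rng = random.Random(0)
    enc = random_matrix(4, 8, rng)  # 8 input dims -> 4 latent dims
    dec = random_matrix(8, 4, rng)  # 4 latent dims -> 8 output dims
    x = [float(i) for i in range(8)]
    released = sanitize(x, enc, dec, n_conf=2, privacy_level=0.5, rng=rng)
    print(len(released))
```

In this sketch, raising `privacy_level` injects more noise into the confidential coordinates only, which mirrors the idea that the data provider sets the privacy level while the non-confidential part is left intact for downstream use.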