Protecting sensitive information against data-exploiting attacks is an emerging research area in data mining. In the past, several methods have been introduced to protect individual privacy from such attacks while maximizing the data utility of the application. However, these existing techniques are not sufficient to protect the data owner's privacy effectively, especially in scenarios that use visualizable data (e.g., images and videos) or in applications that are computationally expensive to implement. To address these problems, we propose a new dimension reduction-based method for privacy preservation. Our method generates dimension-reduced data for performing machine learning tasks and prevents a strong adversary from reconstructing the original data. We first introduce a theoretical approach for evaluating dimension reduction-based privacy-preserving mechanisms, then propose a non-linear dimension reduction framework for privacy preservation, motivated by state-of-the-art neural network structures. We conducted experiments on three face image datasets (AT&T, YaleB, and CelebA). The results show that when the number of dimensions is reduced to seven, we achieve accuracies of 79%, 80%, and 73%, respectively, and the reconstructed images are not recognizable to the naked eye.
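To make the reduce-then-reconstruct setting concrete, the following is a minimal sketch under stated assumptions. It is not the non-linear neural framework proposed here: it swaps in plain linear PCA as a stand-in, uses synthetic Gaussian vectors in place of face images, and models the adversary as someone who knows the projection and applies the best linear inverse. The function names (`pca_reduce`, `reconstruct`, `relative_error`) and the data shapes are hypothetical and chosen only for illustration.

```python
import numpy as np


def pca_reduce(X, k):
    """Project the rows of X onto the top-k principal components.

    Linear stand-in for a dimension reduction step; returns the
    reduced data Z, the projection matrix W (d x k), and the mean.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T
    Z = Xc @ W
    return Z, W, mean


def reconstruct(Z, W, mean):
    """Linear reconstruction by an adversary assumed to know W and mean."""
    return Z @ W.T + mean


def relative_error(X, X_hat):
    """Frobenius-norm reconstruction error relative to the norm of X."""
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)


# Synthetic stand-in for 200 flattened 32x32 images (assumption, not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1024))

Z, W, mean = pca_reduce(X, k=7)      # seven dimensions, as in the experiments
X_hat = reconstruct(Z, W, mean)      # adversary's attempt to recover X
err = relative_error(X, X_hat)

print("reduced shape:", Z.shape)
print("relative reconstruction error:", err)
```

A high relative error here only indicates that seven linear components discard most of the information in this synthetic data. It says nothing about utility on a downstream task, which would have to be measured separately by training a classifier on `Z`.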