Obfuscating a dataset by adding random noise to protect the privacy of sensitive training samples is crucial for preventing data leakage to untrusted parties in edge applications. We conduct comprehensive experiments to investigate how dataset obfuscation affects the resulting model weights, in terms of model accuracy, Frobenius-norm (F-norm)-based model distance, and the level of data privacy, and we discuss potential applications using the proposed Privacy, Utility, and Distinguishability (PUD) triangle diagram to visualize requirement preferences. Our experiments are based on the popular MNIST and CIFAR-10 datasets under both independent and identically distributed (IID) and non-IID settings. Significant findings include a trade-off between model accuracy and privacy level, and a trade-off between model difference and privacy level. The results indicate broad application prospects for training outsourcing in edge computing and for guarding against attacks in Federated Learning among edge devices.
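A minimal sketch of the two quantities studied above, assuming zero-mean Gaussian noise as the obfuscation mechanism (the abstract specifies only "random noise") and a per-layer summation for the F-norm model distance; the names `obfuscate`, `fnorm_model_distance`, and `noise_std` are illustrative choices, not the paper's API:

```python
import numpy as np

def obfuscate(images, noise_std, rng=None):
    """Obfuscate a dataset by adding zero-mean Gaussian noise.

    Assumption: noise_std acts as the privacy knob; larger values hide
    more of each sample but degrade the utility of the trained model.
    """
    rng = rng or np.random.default_rng(0)
    noisy = images + rng.normal(0.0, noise_std, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)  # keep pixel values in a valid range

def fnorm_model_distance(weights_a, weights_b):
    """One plausible F-norm model distance: sum of per-layer Frobenius
    norms of the weight differences between two trained models."""
    return sum(np.linalg.norm(wa - wb) for wa, wb in zip(weights_a, weights_b))

# Usage: obfuscate a batch of 28x28 grayscale images (e.g., MNIST-like data)
clean = np.random.default_rng(1).random((64, 28, 28))
private = obfuscate(clean, noise_std=0.3)
```

Sweeping `noise_std` and recording model accuracy, the F-norm distance to a model trained on clean data, and the achieved privacy level is one way to trace out the accuracy-privacy and distance-privacy trade-offs the abstract reports.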