Together or Alone: The Price of Privacy in Collaborative Learning

Machine learning is widely used to generate predictions, and these predictions are generally more accurate when the model is trained on a larger dataset. In practice, however, the data is usually divided among different entities. For privacy reasons, each entity can train locally, and the resulting models can then be safely aggregated among the participants. However, if \textit{Collaborative Learning} involves only two participants, safe aggregation loses its power, since the output of the training already reveals much information about each participant. To resolve this issue, the participants must employ privacy-preserving mechanisms, which inevitably affect the accuracy of the model. In this paper, we model the training process as a two-player game in which each player aims to achieve higher accuracy while preserving its privacy. We introduce the notion of \textit{Price of Privacy}, a novel approach to measuring the effect of privacy protection on the accuracy of the model. We develop a theoretical model for different player types and, under some assumptions, either find a Nash Equilibrium or prove that one exists. Moreover, we confirm these assumptions via a Recommendation Systems use case: for a specific learning algorithm, we apply three privacy-preserving mechanisms to two real-world datasets. Finally, as complementary work for the designed game, we interpolate the relationship between privacy and accuracy for this use case and present three other methods to approximate it in a real-world scenario.
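
To make the game-theoretic framing concrete, the following is a minimal sketch of how pure-strategy Nash Equilibria of a two-player privacy game could be found by grid search. Everything in it is an illustrative assumption rather than the model from the paper: the discrete privacy levels in `LEVELS`, the toy payoff in `utility`, and the weights `benefit_weight` and `privacy_weight` are all hypothetical. Each player picks a protection level between 0.0 (no protection) and 1.0 (full protection); a pair of levels is an equilibrium if neither player can raise its own payoff by changing only its own level.

```python
from itertools import product

# Hypothetical discrete privacy levels: 0.0 = no protection, 1.0 = full protection.
LEVELS = [0.0, 0.25, 0.5, 0.75, 1.0]


def utility(own, other, benefit_weight=1.0, privacy_weight=0.3):
    """Toy payoff for one player (an assumption, not the paper's model).

    The accuracy gain from collaborating shrinks as either player adds
    protection; the privacy loss shrinks as the player's own protection grows.
    """
    accuracy_gain = (1.0 - own) * (1.0 - other)
    privacy_loss = 1.0 - own
    return benefit_weight * accuracy_gain - privacy_weight * privacy_loss


def is_nash(p1, p2, eps=1e-12):
    """True if neither player gains by unilaterally switching to another level."""
    u1, u2 = utility(p1, p2), utility(p2, p1)
    no_deviation_1 = all(utility(q, p2) <= u1 + eps for q in LEVELS)
    no_deviation_2 = all(utility(q, p1) <= u2 + eps for q in LEVELS)
    return no_deviation_1 and no_deviation_2


def pure_nash_equilibria():
    """Enumerate every pair of levels on the grid and keep the equilibria."""
    return [(p1, p2) for p1, p2 in product(LEVELS, repeat=2) if is_nash(p1, p2)]


if __name__ == "__main__":
    print(pure_nash_equilibria())
```

Under these toy payoffs the search reports two equilibria, `(0.0, 0.0)` and `(1.0, 1.0)`: both players either collaborate without protection or both protect fully. This outcome is a property of the assumed payoff only and should not be read as a result of the paper.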
