Federated learning enables a population of clients, working with a trusted server, to collaboratively learn a shared machine learning model while keeping each client's data within its own local systems. This reduces the risk of exposing sensitive data, but it is still possible to reverse engineer information about a client's private data set from the model parameters it communicates. Most federated learning systems therefore use differential privacy, which introduces noise into the parameters. This adds uncertainty to any attempt to reveal private client data, but it also reduces the accuracy of the shared model, limiting how much privacy-preserving noise can usefully be applied. A system can further reduce the coordinating server's ability to recover private client information, with no additional loss of accuracy, by also incorporating secure multiparty computation. An approach combining both techniques is especially relevant to financial firms, as it opens new possibilities for collaborative learning without exposing sensitive client data; this could yield more accurate models for important tasks such as optimal trade execution, credit origination, and fraud detection. The key contributions of this paper are: we present a privacy-preserving federated learning protocol to a non-specialist audience, demonstrate it using logistic regression on a real-world credit card fraud data set, and evaluate it using an open-source simulation platform that we have adapted for developing federated learning systems.
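To make the combination of techniques concrete, the following is a minimal sketch in Python (NumPy only) of one federated round in which clients take a local logistic regression step, apply differential privacy by clipping and noising their updates, and hide individual updates from the server with pairwise additive masks that cancel in the aggregate. It is not the paper's implementation: the names (Client, local_update, fed_round), the clipping bound, and the noise scale are illustrative assumptions, and a real secure aggregation protocol would derive the masks from pairwise key agreement rather than a shared generator.

# A minimal sketch, not the paper's protocol: one round of federated
# averaging with differentially private client updates and a toy
# additive-masking stand-in for secure aggregation.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class Client:
    def __init__(self, X, y):
        self.X, self.y = X, y

    def local_update(self, w, lr=0.1, clip=1.0, sigma=0.5):
        # One gradient step of logistic regression on local data.
        grad = self.X.T @ (sigmoid(self.X @ w) - self.y) / len(self.y)
        update = -lr * grad
        # Differential privacy: clip the update's L2 norm, then add
        # Gaussian noise calibrated to the clipping bound. The values
        # of clip and sigma here are illustrative, not tuned.
        update *= min(1.0, clip / (np.linalg.norm(update) + 1e-12))
        return update + rng.normal(0.0, sigma * clip, size=w.shape)

def fed_round(clients, w):
    n, d = len(clients), w.shape[0]
    # Toy secure aggregation: client i adds mask m_ij for each j > i and
    # subtracts m_ji for each j < i, so the masks cancel in the sum and
    # the server never sees a raw update. In a real protocol each pair
    # of clients would derive m_ij from a shared secret; generating them
    # centrally here is purely for illustration.
    masks = {(i, j): rng.normal(size=d)
             for i in range(n) for j in range(i + 1, n)}
    masked = []
    for i, c in enumerate(clients):
        u = c.local_update(w)
        for j in range(n):
            if i < j:
                u = u + masks[(i, j)]
            elif j < i:
                u = u - masks[(j, i)]
        masked.append(u)  # what the server receives from client i
    # The server averages the masked updates; the masks cancel exactly,
    # so it recovers only the noisy average, never any single update.
    return w + np.mean(masked, axis=0)

# Toy demo: three clients with synthetic binary (fraud-style) labels.
d = 5
w = np.zeros(d)
clients = [Client(rng.normal(size=(100, d)),
                  rng.integers(0, 2, size=100).astype(float))
           for _ in range(3)]
for _ in range(20):
    w = fed_round(clients, w)
print("learned weights:", np.round(w, 3))

The design point the sketch illustrates is the one the abstract makes: the noise added for differential privacy degrades the averaged model, while the masks hide individual contributions from the server at no accuracy cost, since they cancel exactly in the aggregate.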