Protecting user privacy is an important concern in machine learning, as evidenced by the rollout of the General Data Protection Regulation (GDPR) in the European Union (EU) in May 2018. The GDPR is designed to give users more control over their personal data, which motivates us to explore machine learning frameworks for data sharing that do not violate user privacy. To meet this goal, we propose SecureBoost, a novel lossless privacy-preserving tree-boosting system for the federated learning setting. SecureBoost first conducts entity alignment under a privacy-preserving protocol and then constructs boosting trees across multiple parties with a carefully designed encryption strategy. This federated learning system allows the learning process to be conducted jointly by multiple parties that share common user samples but hold different feature sets, which corresponds to a vertically partitioned data set. An advantage of SecureBoost is that it provides the same level of accuracy as the non-privacy-preserving approach while revealing no information about any private data provider. We show that the SecureBoost framework is as accurate as non-federated gradient tree-boosting algorithms that require centralized data, and that it is thus highly scalable and practical for industrial applications such as credit risk analysis. We also discuss information leakage during protocol execution and propose ways to provably reduce it.