XGBoost is one of the most widely used machine learning models in industry owing to its high accuracy and efficiency. To address the data isolation that arises in big data applications, a secure and efficient federated XGBoost (FedXGB) model is needed. Existing FedXGB models either leak data or apply only to the two-party setting, with heavy communication and computation overhead. This paper proposes a lossless multi-party federated XGBoost learning framework with a security guarantee. The framework reshapes XGBoost's split criterion calculation under a secret sharing setting and solves the leaf weight calculation problem by leveraging distributed optimization. A thorough analysis of model security is also provided, and numerical results on benchmark datasets show that the proposed FedXGB outperforms state-of-the-art models.
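To make the two quantities named above concrete, the following is a minimal sketch, not the protocol proposed in this paper. It combines textbook additive secret sharing with the standard XGBoost split gain and leaf weight formulas. The field modulus `PRIME`, the fixed-point factor `SCALE`, and all function names are illustrative assumptions. Unlike a full secure protocol, this sketch reveals the aggregated gradient sums in the clear before the gain is evaluated.

```python
import secrets

PRIME = 2**61 - 1   # field modulus (illustrative assumption)
SCALE = 10**6       # fixed-point scaling factor (illustrative assumption)


def encode(x: float) -> int:
    """Map a real number to a field element using fixed-point encoding."""
    return round(x * SCALE) % PRIME


def decode(v: int) -> float:
    """Map a field element back to a real number (upper half is negative)."""
    if v > PRIME // 2:
        v -= PRIME
    return v / SCALE


def share(value: float, n_parties: int) -> list:
    """Split a value into n additive shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (encode(value) - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(local_values: list) -> float:
    """Aggregate one local statistic per party via additive secret sharing.

    all_shares[i][j] is the share of party i's value that party j receives.
    Each party j only ever sees uniformly random-looking shares.
    """
    n = len(local_values)
    all_shares = [share(v, n) for v in local_values]
    # Each party adds up the shares it received from every other party.
    partial = [sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)]
    # Only the combined partial sums reveal the global total.
    return decode(sum(partial) % PRIME)


def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Standard XGBoost split gain from first/second-order gradient sums."""
    def score(g, h):
        return g * g / (h + lam)

    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent) - gamma


def leaf_weight(g, h, lam=1.0):
    """Standard XGBoost optimal leaf weight: w* = -G / (H + lambda)."""
    return -g / (h + lam)


if __name__ == "__main__":
    # Three hypothetical parties, each holding local gradient sums for the
    # left and right child of one candidate split.
    G_L = secure_sum([0.8, 0.7, 0.5])
    H_L = secure_sum([0.4, 0.3, 0.3])
    G_R = secure_sum([-0.9, -0.6, -0.5])
    H_R = secure_sum([0.5, 0.3, 0.2])
    print("gain:", split_gain(G_L, H_L, G_R, H_R))
    print("left leaf weight:", leaf_weight(G_L, H_L))
```

The fixed-point encoding introduces a rounding error of at most half a unit of `1 / SCALE` per party, so the aggregated sums in this sketch are close to, but not bit-identical with, their plaintext counterparts.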