Federated Learning (FL) is an emerging trend in machine learning and artificial intelligence. It allows multiple participants to collaboratively train a better global model and offers a privacy-aware paradigm for model training, since it does not require participants to release their original training data. However, existing FL solutions for vertically partitioned data or decision trees require heavy cryptographic operations. In this paper, we propose a framework named FederBoost for private federated learning of gradient boosting decision trees (GBDT). It supports running GBDT over both vertically and horizontally partitioned data. Vertical FederBoost does not require any cryptographic operation, and horizontal FederBoost requires only lightweight secure aggregation. The key observation is that the whole training process of GBDT relies on the ordering of the data rather than on the values themselves. We fully implement FederBoost and evaluate its utility and efficiency through extensive experiments performed on three public datasets. Our experimental results show that both vertical and horizontal FederBoost achieve the same level of accuracy as centralized training, where all data are collected on a central server, and that they are 4-5 orders of magnitude faster than the state-of-the-art solutions for federated decision tree training, hence offering practical solutions for industrial applications.
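The key observation above can be illustrated with a minimal sketch. It is not the FederBoost protocol itself: the function name `best_split_partition`, the regularizer `lam`, and the XGBoost-style second-order gain are assumptions chosen for illustration. The sketch shows only that the partition chosen by a greedy split search is unchanged when the feature values are replaced by any strictly increasing transformation of them, because the search reads nothing but their order.

```python
import math


def best_split_partition(feature, gradients, hessians, lam=1.0):
    """Return the indices sent to the left child by the best split.

    Illustrative assumption: the split is scored with the second-order
    gain G^2 / (H + lam) used by common GBDT implementations. The
    feature values are used only to sort samples and to detect ties.
    """
    order = sorted(range(len(feature)), key=lambda i: feature[i])
    g_total, h_total = sum(gradients), sum(hessians)
    g_left = h_left = 0.0
    best_gain, best_k = -math.inf, None

    for k in range(len(order) - 1):
        i = order[k]
        g_left += gradients[i]
        h_left += hessians[i]
        # A threshold cannot separate two samples with equal values.
        if feature[order[k]] == feature[order[k + 1]]:
            continue
        g_right, h_right = g_total - g_left, h_total - h_left
        gain = (
            g_left ** 2 / (h_left + lam)
            + g_right ** 2 / (h_right + lam)
            - g_total ** 2 / (h_total + lam)
        )
        if gain > best_gain:
            best_gain, best_k = gain, k

    if best_k is None:  # all feature values are equal, so no valid split
        return frozenset()
    return frozenset(order[: best_k + 1])


# Toy data (made up for this sketch): samples with a small feature
# value have negative gradients, the rest have positive gradients.
feature = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6]
gradients = [-1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
hessians = [1.0] * len(feature)

raw = best_split_partition(feature, gradients, hessians)
# exp() is strictly increasing, so it changes the values but not the order.
warped = best_split_partition([math.exp(x) for x in feature], gradients, hessians)
```

Here `raw` and `warped` are the same set of sample indices, which is why a party holding a feature could, in principle, share ordering information in place of the raw values without changing the tree that is learned.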