Increasing concerns about data privacy and security have driven the emerging field of privacy-preserving machine learning from isolated data sources, i.e., federated learning. One class, vertical federated learning, in which different parties hold different features for common users, has great potential to drive a wider variety of business cooperation among enterprises in many fields. In machine learning, decision tree ensembles such as gradient boosting decision tree (GBDT) and random forest are widely applied, powerful models with high interpretability and modeling efficiency. However, interpretability is compromised in state-of-the-art vertical federated learning frameworks such as SecureBoost, which anonymize features to avoid possible data breaches. To address this issue in the inference process, in this paper we propose Fed-EINI, which protects data privacy while allowing the disclosure of feature meaning by concealing decision paths with a communication-efficient secure computation method for the inference outputs. The advantages of Fed-EINI are demonstrated through both theoretical analysis and extensive numerical results.
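The core idea of concealing decision paths can be illustrated with a minimal plaintext sketch (no cryptography, hypothetical values): each party locally marks which leaves of a shared tree are consistent with its own features, and the element-wise conjunction of these candidate-leaf indicators isolates the single leaf actually reached, so no party reveals its decision path or feature thresholds. In Fed-EINI this aggregation would be carried out under a secure computation protocol rather than in the clear.

```python
import numpy as np

# Weights of the four leaves of one (hypothetical) decision tree.
leaf_weights = np.array([0.5, -0.2, 0.1, 0.8])

# Candidate-leaf indicators computed locally by two parties:
# 1 means "this leaf is reachable given my features alone".
party_a = np.array([1, 1, 0, 0])  # party A's candidate leaves
party_b = np.array([0, 1, 0, 1])  # party B's candidate leaves

# Conjunction across parties leaves exactly one reachable leaf.
reached = party_a & party_b
assert reached.sum() == 1

# The tree's contribution to the inference output is the inner
# product of the indicator with the leaf weights.
prediction = float(reached @ leaf_weights)
print(prediction)  # -0.2
```

This sketch only shows the indicator-and-inner-product structure of the inference; the communication efficiency of Fed-EINI comes from performing this aggregation securely, which is not modeled here.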