The transparency of machine learning models used for decision support in various industries is becoming essential for ensuring their ethical use. To that end, feature attribution methods such as SHAP (SHapley Additive exPlanations) are widely used to explain the predictions of black-box machine learning models to customers and developers. However, a parallel trend has been to train machine learning models in collaboration with other data holders without accessing their data. Such models, trained over horizontally or vertically partitioned data, present a challenge for explainable AI because the explaining party may have a biased view of the background data or only a partial view of the feature space. As a result, explanations obtained from different participants of distributed machine learning may be inconsistent with one another, undermining trust in the product. This paper presents an Explainable Data Collaboration Framework based on a model-agnostic additive feature attribution algorithm (KernelSHAP) and the Data Collaboration method of privacy-preserving distributed machine learning. In particular, we present three algorithms for different scenarios of explainability in Data Collaboration and verify their consistency with experiments on open-access datasets. Our results demonstrate a significant decrease (by at least a factor of 1.75) in feature attribution discrepancies among the users of distributed machine learning.
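As a brief illustration of the kind of attribution the framework builds on, the sketch below applies KernelSHAP via the open-source `shap` package to a simple scikit-learn classifier. The dataset, model, and background-sample size are illustrative assumptions and not the paper's experimental setup; the comment about background data merely restates the discrepancy issue described above.

```python
# Minimal sketch (not the paper's code) of model-agnostic feature attribution
# with KernelSHAP, using the open-source `shap` package and scikit-learn.
import numpy as np
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

# Train a simple "black-box" model on an open-access dataset (illustrative choice).
X, y = load_breast_cancer(return_X_y=True)
model = LogisticRegression(max_iter=5000).fit(X, y)

# KernelSHAP explains predictions relative to a background dataset.
# In a distributed setting, each party may hold a different background sample,
# which is one source of the attribution discrepancies discussed in the abstract.
background = shap.sample(X, 50, random_state=0)
explainer = shap.KernelExplainer(model.predict_proba, background)

# Additive feature attributions for a few instances.
shap_values = explainer.shap_values(X[:3])
print(np.array(shap_values).shape)
```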