Protecting the privacy of people whose data is used by machine learning algorithms is important. Differential Privacy is the appropriate mathematical framework for formal guarantees of privacy, and boosted decision trees are a popular machine learning technique. So we propose and test a practical algorithm for boosting decision trees that guarantees differential privacy. Privacy is enforced because our booster never puts too much weight on any one example; this ensures that each individual's data never influences a single tree "too much." Experiments show that this boosting algorithm can produce better model sparsity and accuracy than other differentially private ensemble classifiers.
翻译:保护数据被机器学习算法使用的人的隐私非常重要。 差异隐私是正式保障隐私的适当数学框架,而强化决策树是一种受欢迎的机器学习技术。 因此,我们提出并测试一种促进决策树的实用算法,它能保障差异隐私。 隐私之所以被强制实施是因为我们的助推器从未对任何一个例子给予过重的权重;这确保每个人的数据不会影响一棵“太大”的树。 实验显示,这种增强算法能够比其他差别化的私人共性分类者产生更好的模型宽度和准确性。