What is the value of an individual model in an ensemble of binary classifiers? We answer this question by introducing a class of transferable utility cooperative games called \textit{ensemble games}. In machine learning ensembles, pre-trained models cooperate to make classification decisions. To quantify the importance of models in these ensemble games, we define \textit{Troupe}, an efficient algorithm that allocates payoffs based on approximate Shapley values of the classifiers. We argue that the Shapley value of a model in these games is an effective decision metric for choosing a high-performing subset of models from the ensemble. We prove analytically that our Shapley value estimation scheme is precise and scalable; its performance improves with the size of the dataset and of the ensemble. Empirical results on real-world graph classification tasks demonstrate that our algorithm produces high-quality estimates of the Shapley value. We find that Shapley values can be used for ensemble pruning, and that adversarial models receive a low valuation. Complex classifiers are frequently found to be responsible for both correct and incorrect classification decisions.
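To make the idea of valuing models by approximate Shapley values concrete, the following is a minimal sketch of a generic permutation-sampling estimator for a single data point. It is not the Troupe algorithm itself. The game definition is an assumption made for illustration: each model's weight is its predicted probability for the correct class divided by the ensemble size, and a coalition has value 1 when its total weight reaches a quota of 0.5. The names `shapley_permutation_estimate`, `weights`, and `quota` are hypothetical.

```python
import random


def shapley_permutation_estimate(weights, quota=0.5, n_permutations=2000, seed=0):
    """Estimate Shapley values in a weighted voting game by sampling permutations.

    Assumption (illustrative, not taken from the paper): a coalition wins
    (value 1) when the sum of its members' non-negative weights reaches
    `quota`; otherwise its value is 0.
    """
    rng = random.Random(seed)
    m = len(weights)
    totals = [0.0] * m
    players = list(range(m))
    for _ in range(n_permutations):
        rng.shuffle(players)
        running = 0.0
        for i in players:
            running += weights[i]
            if running >= quota:
                # Player i is pivotal: it turns a losing coalition into a winning one.
                totals[i] += 1.0
                break  # weights are non-negative, so later players add nothing
    return [t / n_permutations for t in totals]


# Three hypothetical models and their probabilities for the correct class.
probabilities = [0.9, 0.8, 0.1]
weights = [p / len(probabilities) for p in probabilities]
phi = shapley_permutation_estimate(weights, quota=0.5, n_permutations=500, seed=1)
print(phi)
```

In this toy game the two strong models can only win together, so they split the payoff roughly evenly, while the weak model is never pivotal and receives zero. Averaging such per-point estimates over a dataset would give one value per model, which is the kind of score the abstract proposes for ensemble pruning.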