Model distillation has been a popular method for producing interpretable machine learning: an interpretable "student" model is trained to mimic the predictions of a black-box "teacher" model. However, when the student model is sensitive to the variability of the data sets used for training, the corresponding interpretation is not reliable. Existing strategies stabilize model distillation by checking whether a large enough corpus of pseudo-data has been generated to reliably reproduce the student model, but such methods have so far been developed only for specific student models. In this paper, we develop a generic approach to stable model distillation based on the central limit theorem for the average loss. We start with a collection of candidate student models and search for candidates that reasonably agree with the teacher. We then construct a multiple testing framework to select a corpus size such that a consistent student model would be selected under different pseudo samples. We demonstrate the application of our proposed approach on three commonly used intelligible models: decision trees, falling rule lists, and symbolic regression. Finally, we conduct simulation experiments on the Mammographic Mass and Breast Cancer datasets and illustrate the testing procedure through a theoretical analysis with a Markov process.
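To make the corpus-size selection concrete, the following minimal Python sketch illustrates the general idea of a CLT-justified test on the average loss; it is not the paper's exact procedure, and it simplifies the multiple testing framework to a single one-sided z-test between the best candidate student and the runner-up. The helpers `teacher` (an object with a `predict` method) and `sample_pseudo_X` (a pseudo-data generator) are assumptions for illustration, and decision trees of varying depth stand in for the candidate students.

    import numpy as np
    from scipy import stats
    from sklearn.tree import DecisionTreeClassifier

    def distill_with_clt_check(teacher, sample_pseudo_X, n=1000,
                               alpha=0.05, max_n=100_000):
        # Sketch only: grow the pseudo-data corpus until the loss gap
        # between the two best candidate students passes a one-sided
        # z-test (justified by the CLT for the average loss), so the
        # same student would be selected across different pseudo samples.
        while True:
            X = sample_pseudo_X(n)      # draw pseudo covariates
            y = teacher.predict(X)      # the teacher provides the labels
            # Candidate students, here trees of different depths.
            candidates = [DecisionTreeClassifier(max_depth=d).fit(X, y)
                          for d in (2, 3, 4)]
            # Per-example 0-1 losses against the teacher, shape (k, n).
            per_example = np.array([c.predict(X) != y for c in candidates],
                                   dtype=float)
            means = per_example.mean(axis=1)
            best, runner_up = np.argsort(means)[:2]
            # CLT: the average loss difference is approximately normal.
            diff = per_example[runner_up] - per_example[best]
            se = diff.std(ddof=1) / np.sqrt(n)
            z = diff.mean() / max(se, 1e-12)
            p_value = 1.0 - stats.norm.cdf(z)   # one-sided p-value
            if p_value < alpha or n >= max_n:
                return candidates[best], n      # stable selection reached
            n *= 2                              # otherwise enlarge the corpus

In this simplified form, doubling the corpus and retesting mirrors the role of the corpus-size selection in the paper: the test only stops once the preferred student is statistically distinguishable from its closest competitor on the teacher's pseudo-data.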