The goal of model distillation is to faithfully transfer teacher model knowledge to a model that is faster, more generalizable, more interpretable, or possesses other desirable characteristics. Human-readability is an important and desirable standard for machine-learned model interpretability. Readable models are transparent and can be reviewed, manipulated, and deployed like traditional source code. As a result, such models can be improved outside the context of machine learning and manually edited if desired. Given that directly training such models is difficult, we propose to train interpretable models using conventional methods, and then distill them into concise, human-readable code. The proposed distillation methodology approximates a model's univariate numerical functions with piecewise-linear curves in a localized manner. The resulting curve model representations are accurate, concise, human-readable, and well-regularized by construction. We describe a piecewise-linear curve-fitting algorithm that produces high-quality results efficiently and reliably across a broad range of use cases. We demonstrate the effectiveness of the overall distillation technique and our curve-fitting algorithm using three publicly available datasets: COMPAS, FICO, and MSLR-WEB30K.