Risk scoring systems, which assign risk scores to users according to their behavior sequences, are widely deployed across many applications. Although many deep learning methods with sophisticated designs achieve promising results, their black-box nature hinders adoption because of fairness, explainability, and compliance considerations. Rule-based systems are considered reliable in these sensitive scenarios. However, building a rule system is labor-intensive: experts must find informative statistics in user behavior sequences, design rules based on those statistics, and assign a weight to each rule. In this paper, we bridge the gap between effective but black-box models and transparent rule models. We propose a two-stage method, RuDi, that distills the knowledge of black-box teacher models into rule-based student models. In the first stage, a Monte Carlo tree search-based statistics generation method provides a set of informative statistics. In the second stage, the statistics are composed into logical rules by our proposed neural logical networks, which learn to mimic the outputs of the teacher models. We evaluate RuDi on three real-world public datasets and an industrial dataset to demonstrate its effectiveness.
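To make the target of the distillation concrete, the following is a minimal sketch of the kind of hand-built rule system described above: statistics computed over a behavior sequence, threshold rules on those statistics, and a weight per rule. It is an illustration only and not RuDi's implementation. The event format, the statistic functions `count_events` and `max_amount`, the `Rule` class, and all thresholds and weights are hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: a behavior sequence is a list of events, each a dict of numeric fields.
Event = Dict[str, float]


def count_events(seq: List[Event]) -> float:
    """Hypothetical statistic: number of events in the sequence."""
    return float(len(seq))


def max_amount(seq: List[Event]) -> float:
    """Hypothetical statistic: largest 'amount' field, or 0.0 for an empty sequence."""
    return max((e["amount"] for e in seq), default=0.0)


@dataclass
class Rule:
    """A threshold rule on one statistic, with an expert-assigned weight."""
    name: str
    statistic: Callable[[List[Event]], float]
    threshold: float
    weight: float

    def fires(self, seq: List[Event]) -> bool:
        # The rule fires when its statistic strictly exceeds the threshold.
        return self.statistic(seq) > self.threshold


def risk_score(seq: List[Event], rules: List[Rule]) -> float:
    """Sum the weights of all rules that fire on the sequence."""
    return sum(r.weight for r in rules if r.fires(seq))


# Illustrative rule set; thresholds and weights are made up for this sketch.
RULES = [
    Rule("many_events", count_events, threshold=3, weight=0.4),
    Rule("large_amount", max_amount, threshold=1000.0, weight=0.6),
]

busy_user = [{"amount": 50.0}, {"amount": 20.0}, {"amount": 10.0}, {"amount": 5.0}]
big_spender = [{"amount": 5000.0}]

print(risk_score(busy_user, RULES))
print(risk_score(big_spender, RULES))
```

Each part of such a system (which statistics to compute, which thresholds to apply, and how to weight the rules) is chosen by hand here. This manual effort is what the two-stage method aims to automate.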