XtracTree:对零售银行业务中采用的勾销方法进行监管验证的简单有效方法 (XtracTree: a Simple and Effective Method for Regulator Validation of Bagging Methods Used in Retail Banking) - 专知论文

会员服务 ·

0

Bagging · ML · SimPLe · 机器学习建模 · 模型验证 ·

2021 年 8 月 17 日

XtracTree: a Simple and Effective Method for Regulator Validation of Bagging Methods Used in Retail Banking

翻译：XtracTree:对零售银行业务中采用的勾销方法进行监管验证的简单有效方法

Jeremy Charlier,Vladimir Makarenkov

Bootstrap aggregation, known as bagging, is one of the most popular ensemble methods used in machine learning (ML). An ensemble method is a ML method that combines multiple hypotheses to form a single hypothesis used for prediction. A bagging algorithm combines multiple classifiers modeled on different sub-samples of the same data set to build one large classifier. Banks, and their retail banking activities, are nowadays using the power of ML algorithms, including decision trees and random forests, to optimize their processes. However, banks have to comply with regulators and governance and, hence, delivering effective ML solutions is a challenging task. It starts with the bank's validation and governance department, followed by the deployment of the solution in a production environment up to the external validation of the national financial regulator. Each proposed ML model has to be validated and clear rules for every algorithm-based decision must be justified. In this context, we propose XtracTree, an algorithm capable of efficiently converting an ML bagging classifier, such as a random forest, into simple "if-then" rules satisfying the requirements of model validation. We use a public loan data set from Kaggle to illustrate the usefulness of our approach. Our experiments demonstrate that using XtracTree, one can convert an ML model into a rule-based algorithm, leading to easier model validation by national financial regulators and the bank's validation department. The proposed approach allowed our banking institution to reduce up to 50% the time of delivery of our AI solutions to the end-user.

翻译：套用算法结合了以同一数据集的不同子样本为模型的多个分类器,以构建一个大型分类器。银行及其零售银行活动现在使用ML算法的力量,包括决策树和随机森林来优化其流程。然而,银行必须遵守监管者和治理者,因此,提供有效的 ML 解决方案是一项艰巨的任务。从银行的验证和治理部门开始,然后在生产环境中部署解决方案,直到国家金融监管机构的外部验证。每一个拟议的 ML 模型都必须经过验证,每个基于算法的决定都必须有明确的规则。在这方面,我们提出Xtrac 模型,一种允许的算法,能够有效地将ML baging 分类器(例如随机的森林)转换成简单的“if-then”交付规则。我们用一个“XL ” 模型来验证我们的银行系统。我们用一个测试模型来展示我们的公共数据集,用一个工具来演示我们的银行的验证规则。

0

相关内容

Bagging

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【SIGIR2020】学习词项区分性，Learning Term Discrimination

【SIGIR2020】学习词项区分性，Learning Term Discrimination

专知会员服务

16+阅读 · 2020年4月28日

随机特征核近似综述: 算法与理论，Random Features for Kernel Approximation: A Survey in Algorithms, Theory, and Beyond

随机特征核近似综述: 算法与理论，Random Features for Kernel Approximation: A Survey in Algorithms, Theory, and Beyond

专知会员服务

33+阅读 · 2020年4月26日

【CMU-TACL2020】低资源跨语言实体链接，Low-resource Crosslingual EntityLinking

专知会员服务

17+阅读 · 2020年3月29日

【Mila-Google】使用元学习动态调整源代码模型，On-the-Fly Adaptation of Source Code Models using Meta-Learning

【Mila-Google】使用元学习动态调整源代码模型，On-the-Fly Adaptation of Source Code Models using Meta-Learning

专知会员服务

21+阅读 · 2020年3月28日

《可解释的机器学习-interpretable-ml》238页pdf

《可解释的机器学习-interpretable-ml》238页pdf

专知会员服务

208+阅读 · 2020年2月24日

【机器学习教程】生物导体MLInterfaces包到基因表达数据的应用，applications of the BioconductorMLInterfaces package to gene expression data

【机器学习教程】生物导体MLInterfaces包到基因表达数据的应用，applications of the BioconductorMLInterfaces package to gene expression data

专知会员服务

18+阅读 · 2020年1月11日

【ECML-PKDD 2019】序列和时间序列学习的有效线性模型（Effective Linear Models for Learning with Sequences and Time Series），Georgiana Ifrim

【ECML-PKDD 2019】序列和时间序列学习的有效线性模型（Effective Linear Models for Learning with Sequences and Time Series），Georgiana Ifrim

专知会员服务

35+阅读 · 2019年12月1日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

LibRec 精选：推荐系统的论文与源码

LibRec 精选：推荐系统的论文与源码

LibRec智能推荐

14+阅读 · 2018年11月29日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

机器学习研究会

13+阅读 · 2017年12月25日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

LibRec 每周算法：Wide & Deep (by Google)

LibRec 每周算法：Wide & Deep (by Google)

LibRec智能推荐

9+阅读 · 2017年10月25日

【推荐】决策树/随机森林深入解析

【推荐】决策树/随机森林深入解析

机器学习研究会

5+阅读 · 2017年9月21日

【推荐】Python机器学习生态圈(Scikit-Learn相关项目)

【推荐】Python机器学习生态圈(Scikit-Learn相关项目)

机器学习研究会

6+阅读 · 2017年8月23日

Trading Data for Wind Power Forecasting: A Regression Market with Lasso Regularization

Arxiv

0+阅读 · 2021年10月14日

The Time Domain Linear Sampling Method for Determining the Shape of a Scatterer using Electromagnetic Waves

The Time Domain Linear Sampling Method for Determining the Shape of a Scatterer using Electromagnetic Waves

Arxiv

0+阅读 · 2021年10月14日

Preconditioners for robust optimal control problems under uncertainty

Preconditioners for robust optimal control problems under uncertainty

Arxiv

0+阅读 · 2021年10月14日

Integrated Path Planning and Tracking Control of Marine Current Turbine in Uncertain Ocean Environments

Arxiv

0+阅读 · 2021年10月14日

A Contraction Theory Approach to Optimization Algorithms from Acceleration Flows

Arxiv

0+阅读 · 2021年10月13日

On the Double Descent of Random Features Models Trained with SGD

Arxiv

0+阅读 · 2021年10月13日

Towards formalization and monitoring of microscopic traffic parameters using temporal logic

Arxiv

0+阅读 · 2021年10月12日

Worst-case Delay Bounds in Time-Sensitive Networks with Packet Replication and Elimination

Arxiv

0+阅读 · 2021年10月12日

Learning Multiple Stock Trading Patterns with Temporal Routing Adaptor and Optimal Transport

Arxiv

8+阅读 · 2021年6月25日

The Search Problem in Mixture Models

Arxiv

3+阅读 · 2018年2月24日

VIP会员

文章信息

相关主题

机器学习建模

相关VIP内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【SIGIR2020】学习词项区分性，Learning Term Discrimination

【SIGIR2020】学习词项区分性，Learning Term Discrimination

专知会员服务

16+阅读 · 2020年4月28日

随机特征核近似综述: 算法与理论，Random Features for Kernel Approximation: A Survey in Algorithms, Theory, and Beyond

随机特征核近似综述: 算法与理论，Random Features for Kernel Approximation: A Survey in Algorithms, Theory, and Beyond

专知会员服务

33+阅读 · 2020年4月26日

【CMU-TACL2020】低资源跨语言实体链接，Low-resource Crosslingual EntityLinking

专知会员服务

17+阅读 · 2020年3月29日

【Mila-Google】使用元学习动态调整源代码模型，On-the-Fly Adaptation of Source Code Models using Meta-Learning

【Mila-Google】使用元学习动态调整源代码模型，On-the-Fly Adaptation of Source Code Models using Meta-Learning

专知会员服务

21+阅读 · 2020年3月28日

《可解释的机器学习-interpretable-ml》238页pdf

《可解释的机器学习-interpretable-ml》238页pdf

专知会员服务

208+阅读 · 2020年2月24日

【机器学习教程】生物导体MLInterfaces包到基因表达数据的应用，applications of the BioconductorMLInterfaces package to gene expression data

【机器学习教程】生物导体MLInterfaces包到基因表达数据的应用，applications of the BioconductorMLInterfaces package to gene expression data

专知会员服务

18+阅读 · 2020年1月11日

【ECML-PKDD 2019】序列和时间序列学习的有效线性模型（Effective Linear Models for Learning with Sequences and Time Series），Georgiana Ifrim

【ECML-PKDD 2019】序列和时间序列学习的有效线性模型（Effective Linear Models for Learning with Sequences and Time Series），Georgiana Ifrim

专知会员服务

35+阅读 · 2019年12月1日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

数据要素发展报告(2025年)：附下载

人工智能代理提升战时舰船战备水平

【NeurIPS2025教程】大语言模型规划

NeurIPS 2025 教程：深度学习训练不稳定性的理论洞见

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

LibRec 精选：推荐系统的论文与源码

LibRec 精选：推荐系统的论文与源码

LibRec智能推荐

14+阅读 · 2018年11月29日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

机器学习研究会

13+阅读 · 2017年12月25日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

LibRec 每周算法：Wide & Deep (by Google)

LibRec 每周算法：Wide & Deep (by Google)

LibRec智能推荐

9+阅读 · 2017年10月25日

【推荐】决策树/随机森林深入解析

【推荐】决策树/随机森林深入解析

机器学习研究会

5+阅读 · 2017年9月21日

【推荐】Python机器学习生态圈(Scikit-Learn相关项目)

【推荐】Python机器学习生态圈(Scikit-Learn相关项目)

机器学习研究会

6+阅读 · 2017年8月23日

相关论文

Trading Data for Wind Power Forecasting: A Regression Market with Lasso Regularization

Arxiv

0+阅读 · 2021年10月14日

The Time Domain Linear Sampling Method for Determining the Shape of a Scatterer using Electromagnetic Waves

The Time Domain Linear Sampling Method for Determining the Shape of a Scatterer using Electromagnetic Waves

Arxiv

0+阅读 · 2021年10月14日

Preconditioners for robust optimal control problems under uncertainty

Preconditioners for robust optimal control problems under uncertainty

Arxiv

0+阅读 · 2021年10月14日

Integrated Path Planning and Tracking Control of Marine Current Turbine in Uncertain Ocean Environments

Arxiv

0+阅读 · 2021年10月14日

A Contraction Theory Approach to Optimization Algorithms from Acceleration Flows

Arxiv

0+阅读 · 2021年10月13日

On the Double Descent of Random Features Models Trained with SGD

Arxiv

0+阅读 · 2021年10月13日

Towards formalization and monitoring of microscopic traffic parameters using temporal logic

Arxiv

0+阅读 · 2021年10月12日

Worst-case Delay Bounds in Time-Sensitive Networks with Packet Replication and Elimination

Arxiv

0+阅读 · 2021年10月12日

Learning Multiple Stock Trading Patterns with Temporal Routing Adaptor and Optimal Transport

Arxiv

8+阅读 · 2021年6月25日

The Search Problem in Mixture Models

Arxiv

3+阅读 · 2018年2月24日

微信扫码咨询专知VIP会员