A proposed simulation technique for population stability testing in credit risk scorecards

Credit risk scorecards are logistic regression models, fitted to large and complex data sets, that the financial industry uses to model the probability of default of a potential customer. To ensure that a scorecard remains a representative model of the population, one tests the hypothesis of population stability, which specifies that the distribution of clients' attributes remains constant over time. Simulating realistic data sets for this purpose is nontrivial, as these data sets are multivariate and contain intricate dependencies. The simulation of such data sets is of practical interest to both practitioners and researchers: practitioners may wish to consider the effect that a specified change in the properties of the data has on the scorecard and its usefulness from a business perspective, while researchers may wish to test a newly developed technique in credit scoring. We propose a simulation technique based on the specification of bad ratios, explained below. Practitioners generally cannot be expected to provide realistic parameter values for a scorecard; these models are simply too complex and contain too many parameters to make such a specification viable. However, practitioners can often confidently specify the bad ratio associated with two different levels of a specific attribute. That is, practitioners are often comfortable making statements such as "on average, a new customer is 1.5 times as likely to default as an existing customer with similar attributes". We propose a method that can be used to obtain parameter values for a scorecard from specified bad ratios. The proposed technique is demonstrated using a realistic example, and we show that the simulated data sets adhere closely to the specified bad ratios. The paper provides a link to a GitHub project containing the R code used to generate the results shown.
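
To illustrate how a specified bad ratio can pin down a scorecard parameter, here is a minimal sketch in Python. It is not the paper's method or its R code. It assumes a single binary attribute (new versus existing customer), an assumed baseline default probability of 0.10 for existing customers, and it reads "bad ratio" as the ratio of default probabilities between the two levels. All function names are hypothetical.

```python
import math
import random


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def coefficient_from_bad_ratio(base_rate, bad_ratio):
    """Log-odds shift that turns base_rate into bad_ratio * base_rate.

    Assumption: the bad ratio is a ratio of default probabilities,
    not an odds ratio.
    """
    target = bad_ratio * base_rate
    if not 0.0 < base_rate < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("default probabilities must lie strictly between 0 and 1")
    return logit(target) - logit(base_rate)


def simulate_defaults(n, base_rate, bad_ratio, share_new=0.5, seed=1):
    """Simulate n customers from a one-attribute logistic model.

    Returns a dict mapping is_new (0 or 1) to [customers, defaults].
    """
    rng = random.Random(seed)
    intercept = logit(base_rate)
    beta_new = coefficient_from_bad_ratio(base_rate, bad_ratio)
    counts = {0: [0, 0], 1: [0, 0]}
    for _ in range(n):
        is_new = 1 if rng.random() < share_new else 0
        p_default = 1.0 / (1.0 + math.exp(-(intercept + beta_new * is_new)))
        counts[is_new][0] += 1
        if rng.random() < p_default:
            counts[is_new][1] += 1
    return counts


def observed_bad_ratio(counts):
    """Default rate of new customers divided by that of existing customers."""
    rate_existing = counts[0][1] / counts[0][0]
    rate_new = counts[1][1] / counts[1][0]
    return rate_new / rate_existing


if __name__ == "__main__":
    sim = simulate_defaults(n=200_000, base_rate=0.10, bad_ratio=1.5)
    print("specified bad ratio: 1.5")
    print("observed bad ratio: ", round(observed_bad_ratio(sim), 3))
```

With one attribute the mapping from bad ratio to coefficient is exact. With several dependent attributes, as in the realistic setting the abstract describes, the bad ratios no longer determine the coefficients one at a time, which is why a dedicated technique is needed.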

