In financial credit scoring, loan applications are either approved or rejected. Default/non-default labels can be observed only for approved samples; rejected samples have no observed outcome, which leads to missing-not-at-random selection bias. Machine learning models trained on such biased data are inevitably unreliable. In this work, we find that the default/non-default classification task and the rejection/approval classification task are highly correlated, as shown by both a real-world data study and theoretical analysis. Consequently, learning default/non-default can benefit from learning rejection/approval. Accordingly, we propose, for the first time, to model biased credit scoring data with Multi-Task Learning (MTL). Specifically, we propose a novel Reject-aware Multi-Task Network (RMT-Net), which uses a gating network based on rejection probabilities to learn the task weights that control how much information is shared from the rejection/approval task to the default/non-default task. RMT-Net leverages the following relation between the two tasks: the larger the rejection probability, the more the default/non-default task needs to learn from the rejection/approval task. Furthermore, we extend RMT-Net to RMT-Net++ for modeling scenarios with multiple rejection/approval strategies. Extensive experiments on several datasets strongly verify the effectiveness of RMT-Net on both approved and rejected samples. In addition, RMT-Net++ further improves on RMT-Net's performance.
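The gating idea can be illustrated with a small sketch. This is not the RMT-Net architecture, which the abstract does not specify in detail. It only shows one way a task weight could grow with the rejection probability. The sigmoid gate, its parameters `a` and `b`, and the names `gate_weight` and `mix_representations` are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gate_weight(p_reject: float, a: float = 4.0, b: float = -2.0) -> float:
    """Hypothetical gate: a weight in (0, 1) that increases with p_reject.

    In a trained model a and b would be learned; here they are fixed
    illustrative values, not taken from the paper.
    """
    return sigmoid(a * p_reject + b)


def mix_representations(h_default, h_reject, p_reject):
    """Blend the two tasks' hidden vectors for the default/non-default head.

    The higher the rejection probability, the more weight is placed on
    the rejection/approval representation.
    """
    g = gate_weight(p_reject)
    return [(1.0 - g) * d + g * r for d, r in zip(h_default, h_reject)]


# Toy hidden vectors (made up for illustration only).
h_default = [0.2, -0.5, 1.0]
h_reject = [1.0, 0.0, -1.0]

for p in (0.1, 0.5, 0.9):
    mixed = mix_representations(h_default, h_reject, p)
    print(p, round(gate_weight(p), 3), [round(v, 3) for v in mixed])
```

With this toy gate, a sample that the approval policy would very likely reject draws most of its representation from the rejection/approval task, while a likely-approved sample relies mainly on the default/non-default task's own features.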