Income verification is the problem of validating a person's stated income given basic identity information such as name, location, job title and employer. It is widely used in mortgage lending, rental applications and other financial risk models. However, current verification processes involve significant human effort and document gathering, which can be both time-consuming and expensive. In this paper, we propose a novel model for verifying an individual's income from the very limited identity information typically available in loan applications. Our model is a combination of a deep neural network and hand-engineered features. The hand-engineered features are based upon matching the input information against income records extracted automatically from various publicly available online sources (e.g., payscale.com, H-1B filings, government employee salaries). We conduct experiments on two data sets, one simulated from H-1B records and the other a real-world set of peer-to-peer (P2P) loan applications obtained from one of the world's largest P2P lending platforms. Our results show a significant reduction in error of 3-6% relative to several strong baselines. We also perform ablation studies to demonstrate that a combined model is indeed necessary to achieve state-of-the-art performance on this task.
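To make the idea of matching-based features concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small in-memory list of public income records and computes, for one application, the count and median income of records that match at progressively looser levels (title, employer and location; title and employer; title only). The names `IncomeRecord` and `matching_features`, the exact-match rule after normalization, and the choice of count and median as features are all illustrative assumptions; the abstract does not specify them.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List


@dataclass(frozen=True)
class IncomeRecord:
    """Hypothetical public income record (e.g. one row from an H-1B filing)."""
    job_title: str
    employer: str
    location: str
    income: float


def _norm(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(text.lower().split())


def matching_features(
    job_title: str,
    employer: str,
    location: str,
    records: List[IncomeRecord],
) -> Dict[str, float]:
    """Return count and median income of matching records at three match levels.

    Assumption: exact matching on normalized strings; a real system would
    likely need fuzzy matching for titles and employer names.
    """
    title, emp, loc = _norm(job_title), _norm(employer), _norm(location)

    # Each level is a predicate over a record, from strictest to loosest.
    levels = {
        "title_employer_location": lambda r: (
            _norm(r.job_title) == title
            and _norm(r.employer) == emp
            and _norm(r.location) == loc
        ),
        "title_employer": lambda r: (
            _norm(r.job_title) == title and _norm(r.employer) == emp
        ),
        "title": lambda r: _norm(r.job_title) == title,
    }

    features: Dict[str, float] = {}
    for name, matches in levels.items():
        incomes = [r.income for r in records if matches(r)]
        features[f"{name}_count"] = float(len(incomes))
        # Use 0.0 as a placeholder when nothing matches; the count feature
        # lets a downstream model tell "no match" apart from a real median.
        features[f"{name}_median"] = float(median(incomes)) if incomes else 0.0
    return features
```

In a combined model of the kind the abstract describes, a feature vector like this would presumably be supplied alongside the neural network's learned representation of the raw text fields; how the two are actually combined is not stated in the abstract.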