Despite accounting for 96.1% of all businesses in Malaysia, access to financing remains one of the most persistent challenges faced by Micro, Small, and Medium Enterprises (MSMEs). Newly established or young businesses are often excluded from formal credit markets as traditional underwriting approaches rely heavily on credit bureau data. This study investigates the potential of bank statement data as an alternative data source for credit assessment to promote financial inclusion in emerging markets. Firstly, we propose a cash flow-based underwriting pipeline where we utilise bank statement data for end to end data extraction and machine learning credit scoring. Secondly, we introduce a novel dataset of 611 loan applicants from a Malaysian lending institution. Thirdly, we develop and evaluate credit scoring models based on application information and bank transaction-derived features. Empirical results show that the use of such data boosts the performance of all models on our dataset, which can improve credit scoring for new-to-lending MSMEs. Lastly, we intend to release the anonymised bank transaction dataset to facilitate further research on MSMEs financial inclusion within Malaysia's emerging economy.
翻译:暂无翻译