Money laundering is a profound, global problem. Nonetheless, there is little statistical and machine learning research on the topic. In this paper, we focus on anti-money laundering in banks. To help organize existing research in the field, we propose a unifying terminology and provide a review of the literature. The review is structured around two central tasks: (i) client risk profiling and (ii) suspicious behavior flagging. We find that client risk profiling is characterized by diagnostics, i.e., efforts to find and explain risk factors. Suspicious behavior flagging, on the other hand, is characterized by non-disclosed features and hand-crafted risk indices. Finally, we discuss directions for future research. One major challenge is the lack of public data sets, which may be addressed by synthetic data generation. Other possible research directions include semi-supervised and deep learning, interpretability, and fairness of the results.