Implementing Machine Learning systems to detect fraud and other Non-Technical Losses (NTL) is challenging: the available data is biased, and the algorithms currently in use are black boxes that stakeholders cannot easily trust or understand. This work describes our human-in-the-loop approach to mitigating these problems in a real system that uses a supervised model to detect NTL for an international utility company from Spain. The approach exploits human knowledge (e.g. from the data scientists or the company's stakeholders) and the information provided by explanatory methods to guide the system during the training process. The method is simple, efficient, and easy to implement in other industrial projects. Tested on a real dataset, it yields a prediction model that is better in terms of accuracy, interpretability, robustness and flexibility.