Artificial intelligence-based techniques applied to the electricity consumption data generated by the smart grid have proven effective in reducing Non-Technical Losses (NTLs), thereby ensuring the safety, reliability, and security of smart energy systems. However, imbalanced data, consecutive missing values, long training times, and complex architectures hinder the real-time application of electricity theft detection models. In this paper, we present EnsembleNTLDetect, a robust and scalable electricity theft detection framework that employs a set of efficient data pre-processing techniques and machine learning models to accurately detect electricity theft by analysing consumers' electricity consumption patterns. The framework uses an enhanced Dynamic Time Warping Based Imputation (eDTWBI) algorithm to impute missing values in the time series data and leverages the Near-miss undersampling technique to generate balanced data. Further, a stacked autoencoder is introduced to reduce dimensionality and improve training efficiency. A Conditional Generative Adversarial Network (CTGAN) is used to augment the dataset to ensure robust training, and a soft voting ensemble classifier is designed to detect consumers with aberrant consumption patterns. Experiments were conducted on the real-time electricity consumption data provided by the State Grid Corporation of China (SGCC) to validate the reliability and efficiency of EnsembleNTLDetect against state-of-the-art electricity theft detection models in terms of various quality metrics.
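To illustrate the final stage, the following is a minimal sketch of soft voting in general, not the EnsembleNTLDetect implementation. It assumes each base classifier outputs per-class probabilities for every consumer; the function name `soft_vote`, the three hypothetical models, and all probability values are invented for illustration only.

```python
from typing import List, Optional, Sequence


def soft_vote(
    probas: Sequence[Sequence[Sequence[float]]],
    weights: Optional[Sequence[float]] = None,
) -> List[int]:
    """Combine classifiers by (weighted) averaging of class probabilities.

    probas[m][i][c] is the probability that model m assigns to class c
    for sample i. Returns the class index with the highest mean
    probability for each sample.
    """
    n_models = len(probas)
    if weights is None:
        weights = [1.0] * n_models  # equal weight for every model
    total = sum(weights)
    n_samples = len(probas[0])
    n_classes = len(probas[0][0])

    labels = []
    for i in range(n_samples):
        # Weighted mean probability of each class across all models.
        avg = [
            sum(w * probas[m][i][c] for m, w in enumerate(weights)) / total
            for c in range(n_classes)
        ]
        # Pick the class with the largest mean (ties go to the lower index).
        labels.append(max(range(n_classes), key=avg.__getitem__))
    return labels


# Hypothetical outputs of three base classifiers for three consumers.
# Class 0 = normal consumption, class 1 = suspected theft (assumed labels).
model_a = [[0.90, 0.10], [0.40, 0.60], [0.55, 0.45]]
model_b = [[0.80, 0.20], [0.30, 0.70], [0.30, 0.70]]
model_c = [[0.70, 0.30], [0.60, 0.40], [0.60, 0.40]]

predictions = soft_vote([model_a, model_b, model_c])
print(predictions)
```

For the third consumer, two of the three hypothetical models favour class 0, so a majority (hard) vote would label it normal, whereas averaging the probabilities favours class 1 because one model is considerably more confident. This is the property that distinguishes soft voting from hard voting.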