Data transformation (DT) is a process that transforms the original data into a form that suits a particular classification algorithm and helps to analyze the data for a specific purpose. To improve prediction performance, we investigated various data transformation methods. This study is conducted in a customer churn prediction (CCP) context in the telecommunication industry (TCI), where customer attrition is a common phenomenon. We propose a novel approach that combines data transformation methods with machine learning models for the CCP problem. We conducted our experiments on publicly available TCI datasets and assessed performance in terms of widely used evaluation measures (e.g., AUC, precision, recall, and F-measure). We present comprehensive comparisons to assess the effect of the transformation methods. The comparison results and statistical test show that most of the proposed data-transformation-based optimized models significantly improve CCP performance. Overall, this manuscript presents an efficient and optimized CCP model for the telecommunication industry.
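As a rough illustration of what a data transformation step can look like before a classifier is trained, the following minimal sketch applies two common transformations to a single numeric feature. The abstract does not name the specific methods studied, so the choice of a log transform and z-score standardization, the function names, and the `call_minutes` values are all assumptions made for illustration only.

```python
import math
import statistics


def log_transform(values):
    """Compress a right-skewed, non-negative feature with log(1 + x)."""
    return [math.log1p(v) for v in values]


def z_score(values):
    """Rescale a feature to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical feature (not taken from the paper's datasets):
# monthly call minutes for five customers, with one extreme value.
call_minutes = [0.0, 12.5, 30.0, 45.0, 900.0]

# Chain the two transformations; the result would be fed to a classifier.
transformed = z_score(log_transform(call_minutes))
```

In a sketch like this, the transformed feature would replace the raw one in the training data, and the same classifier would be evaluated with and without the transformation to compare measures such as AUC or F-measure.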