Churn prevention has always been a major concern in customer retention. This work contributes to the field by formalizing churn prediction in the context of online gambling as a binary classification task. We also propose an algorithmic answer to this problem based on recurrent neural networks. The algorithm is tested on online gambling data in the form of time series, which recurrent neural networks can process efficiently. To evaluate the performance of the trained models, standard machine learning metrics were used, such as accuracy, precision and recall. For this problem in particular, the experiments showed that the choice of architecture depends on which metric is given the greatest importance. Architectures using nBRC favour precision, those using LSTM give better recall, and GRU-based architectures achieve higher accuracy while balancing the other two metrics. Further experiments showed that training the networks on only the more recent time-series histories decreases the quality of the results. We also study how models learned at a specific instant $t$ perform at later times $t^{\prime} > t$. The results show that the performance of models learned at time $t$ remains good at the following instants $t^{\prime} > t$, suggesting that there is no need to refresh the models at a high rate. However, model performance was subject to noticeable variance due to one-off events affecting the data.
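To make the three evaluation metrics named above concrete, the following is a minimal sketch of how accuracy, precision and recall can be computed for a binary churn classifier. The function name `binary_metrics` and the label convention (1 = churner, 0 = non-churner) are assumptions made for illustration; the toy labels are invented and are not results from this work.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision and recall for binary labels.

    Assumed convention (not taken from the paper): 1 = churner, 0 = non-churner.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-churners wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners that were missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-churners correctly passed

    total = tp + fp + fn + tn
    return {
        # Share of all predictions that are correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted churners that really churn.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of real churners that the model catches.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels, invented for illustration only.
example = binary_metrics(
    y_true=[1, 0, 1, 1, 0, 0, 1, 0],
    y_pred=[1, 0, 0, 0, 0, 1, 1, 0],
)
print(example)
```

Under this convention, a model that favours precision raises few false churn alarms, while one that favours recall misses few actual churners; this is the trade-off that makes the preferred architecture depend on the metric considered most important.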