Leaf-FM: A Learnable Feature Generation Factorization Machine for Click-Through Rate Prediction

Click-through rate (CTR) prediction plays an important role in personalized advertising and recommender systems. Although many models, such as FM, FFM and DeepFM, have been proposed in recent years, feature engineering remains an important way to improve model performance in many applications, because raw features rarely lead to optimal results. For example, continuous features are often augmented with power-transformed copies, added as new features, so that the model can easily form non-linear functions of the original feature. However, this kind of feature engineering relies heavily on human experience and is both time-consuming and labor-intensive. On the other hand, a concise CTR model with both fast online serving and good predictive performance is critical for many real-life applications. In this paper, we propose Leaf-FM, a model based on FM that generates new features from the original feature embeddings by learning the transformation functions automatically. We also design three concrete Leaf-FM models that differ in how they combine the original and the generated features. Extensive experiments on three real-world datasets show that Leaf-FM outperforms standard FM by a large margin. Compared with FFM, Leaf-FM achieves significantly better performance with far fewer parameters. On the Avazu and Malware datasets, the add version of Leaf-FM achieves performance comparable to some deep-learning-based models such as DNN and AutoInt. As an improved FM model, Leaf-FM has the same computational complexity as FM in the online serving phase, which makes it applicable to many industrial applications thanks to its better performance and high computational efficiency.
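
To make the idea of generating features from embeddings more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the transformation functions or the combination strategies, so everything here is an assumption for illustration: the names `fm_pairwise`, `generate` and `leaf_fm_add_score` are hypothetical, the transform is assumed to be an element-wise `tanh(w * e + b)` with learnable `w` and `b`, and the "add" combination is assumed to sum each original embedding with its generated counterpart. First-order terms are omitted for brevity.

```python
import math


def fm_pairwise(vectors):
    """Sum of <v_i, v_j> over all pairs i < j, via the standard FM identity.

    For each latent dimension f: 0.5 * ((sum_i v_if)^2 - sum_i v_if^2).
    Cost is O(n * k) instead of O(n^2 * k).
    """
    k = len(vectors[0])
    total = 0.0
    for f in range(k):
        s = sum(v[f] for v in vectors)
        sq = sum(v[f] * v[f] for v in vectors)
        total += 0.5 * (s * s - sq)
    return total


def generate(embedding, weight, bias):
    """Hypothetical learnable transform (an assumption, not from the paper):
    element-wise tanh(w * e + b) applied to one field embedding."""
    return [math.tanh(w * e + b) for e, w, b in zip(embedding, weight, bias)]


def leaf_fm_add_score(embeddings, weights, biases, w0=0.0):
    """Assumed 'add'-style combination: each field's original embedding is
    summed with its generated embedding, then a plain FM pairwise term is
    computed over the combined vectors and squashed to a click probability."""
    combined = []
    for e, w, b in zip(embeddings, weights, biases):
        g = generate(e, w, b)
        combined.append([x + y for x, y in zip(e, g)])
    logit = w0 + fm_pairwise(combined)
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # Three fields with 2-dimensional embeddings (made-up numbers).
    emb = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]
    w = [[0.5, 0.5], [1.0, -1.0], [0.2, 0.3]]
    b = [[0.0, 0.1], [0.0, 0.0], [-0.1, 0.0]]
    print(leaf_fm_add_score(emb, w, b))
```

In this sketch, setting every transform weight and bias to zero makes the generated embeddings vanish, so the score reduces to a plain FM pairwise score. The interaction term is still computed with the usual linear-time FM identity.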
