Click-through rate (CTR) prediction plays a critical role in recommender systems and online advertising. The data used in these applications are multi-field categorical data, where each feature belongs to one field. Field information has been shown to be important, and several existing works take fields into account in their models. In this paper, we propose a novel approach that models field information effectively and efficiently. The proposed approach is a direct improvement of FwFM and is named Field-matrixed Factorization Machines (FmFM, or $FM^2$). We also give a new interpretation of FM and FwFM within the FmFM framework and compare it with FFM. Besides pruning the cross terms, our model supports field-specific variable dimensions of embedding vectors, which acts as a soft pruning. We also propose an efficient way to minimize the dimension while keeping the model performance. The FmFM model can be optimized further by caching the intermediate vectors, and it takes only thousands of floating-point operations (FLOPs) to make a prediction. Our experimental results show that it can outperform FFM, which is more complex. The FmFM model's performance is also comparable to that of DNN models, which require many more FLOPs at runtime.
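To make the idea of a field-matrixed interaction concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each pair of fields owns a matrix that projects an embedding from the first field into the embedding space of the second, and that the pairwise term is the dot product of the projected vector with the second embedding. The function names (`fmfm_logit`, `project`), the feature keys, and all numeric values are hypothetical and chosen for illustration only. Under this assumption, the two fields may use different embedding dimensions, since the matrix need not be square.

```python
import math
from itertools import combinations


def project(v, matrix):
    """Row vector v (length d_i) times matrix (d_i x d_j) -> vector of length d_j."""
    return [
        sum(v[r] * matrix[r][c] for r in range(len(v)))
        for c in range(len(matrix[0]))
    ]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fmfm_logit(active, bias, weights, embeddings, field_matrices):
    """Assumed FmFM-style score for one sample.

    active: list of (field, feature) pairs, one active feature per field,
            in a fixed field order so that (field_i, field_j) keys exist.
    """
    logit = bias + sum(weights[feature] for _, feature in active)
    for (fi, xi), (fj, xj) in combinations(active, 2):
        # Depends only on feature xi and the target field fj,
        # so it could be cached across requests.
        projected = project(embeddings[xi], field_matrices[(fi, fj)])
        logit += dot(projected, embeddings[xj])
    return logit


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy parameters: the "user" field uses 2 dimensions, "ad" uses 3.
embeddings = {"user=u1": [1.0, 2.0], "ad=a7": [1.0, 0.0, 1.0]}
weights = {"user=u1": 0.0, "ad=a7": 0.0}
field_matrices = {("user", "ad"): [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]}
active = [("user", "user=u1"), ("ad", "ad=a7")]

logit = fmfm_logit(active, 0.0, weights, embeddings, field_matrices)
print(logit, sigmoid(logit))
```

In this sketch, setting every field matrix to the identity (with equal dimensions) reduces the pairwise term to a plain dot product, which is one plausible reading of how FM fits inside the framework. A scalar multiple of the identity would give a field-pair weight in the spirit of FwFM.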