Click-through rate (CTR) prediction is one of the fundamental tasks in online advertising and recommendation. While the multi-layer perceptron (MLP) serves as a core component in many deep CTR prediction models, it is widely recognized that a vanilla MLP network alone is inefficient at learning multiplicative feature interactions. As such, many two-stream interaction models (e.g., DeepFM and DCN) have been proposed, which integrate an MLP network with another dedicated network for enhanced CTR prediction. Because the MLP stream learns feature interactions implicitly, existing research focuses mainly on enhancing explicit feature interactions in the complementary stream. In contrast, our empirical study shows that a well-tuned two-stream MLP model that simply combines two MLPs can achieve surprisingly good performance, a result that has not been reported in prior work. Based on this observation, we further propose feature selection and interaction aggregation layers that can be easily plugged in to build an enhanced two-stream MLP model, FinalMLP. This design not only enables differentiated feature inputs for the two streams but also effectively fuses stream-level interactions across them. Our evaluation results on four open benchmark datasets, as well as an online A/B test in our industrial system, show that FinalMLP achieves better performance than many sophisticated two-stream CTR models. Our source code will be available at MindSpore/models and FuxiCTR/model_zoo.
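To make the architecture described above more concrete, the following is a minimal forward-pass sketch of a two-stream MLP with a feature selection step and an aggregation step. It is not the authors' implementation. The abstract does not specify how either layer is built, so the sigmoid gate used for feature selection and the bilinear form used for aggregation are assumptions chosen for illustration, and every name here (`TwoStreamMLP`, `init_layer`, `_select`, the layer sizes) is hypothetical. The input `x` stands in for the flattened feature embedding vector, and each stream has a single hidden layer for brevity.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def dense(x, W, b, relu=True):
    """One fully connected layer: y = act(W x + b)."""
    y = [dot(row, x) + bi for row, bi in zip(W, b)]
    return [max(0.0, v) for v in y] if relu else y


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (hypothetical initializer)."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


class TwoStreamMLP:
    """Illustrative two-stream MLP; gating and fusion forms are assumptions."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Assumed feature selection: one sigmoid gate per stream, so that
        # each stream sees a differently re-weighted copy of the input.
        self.gate1 = init_layer(rng, n_in, n_in)
        self.gate2 = init_layer(rng, n_in, n_in)
        # The two MLP streams (one hidden layer each, for brevity).
        self.mlp1 = init_layer(rng, n_hidden, n_in)
        self.mlp2 = init_layer(rng, n_hidden, n_in)
        # Assumed aggregation: bias + w1.o1 + w2.o2 + o1^T W3 o2.
        self.w1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.W3, _ = init_layer(rng, n_hidden, n_hidden)
        self.bias = 0.0

    def _select(self, x, gate):
        """Re-weight each input dimension by a learned gate in (0, 1)."""
        W, b = gate
        g = dense(x, W, b, relu=False)
        return [sigmoid(gi) * xi for gi, xi in zip(g, x)]

    def predict(self, x):
        """Return a click probability for one flattened input vector."""
        o1 = dense(self._select(x, self.gate1), *self.mlp1)
        o2 = dense(self._select(x, self.gate2), *self.mlp2)
        linear = dot(self.w1, o1) + dot(self.w2, o2)
        # Cross-stream term: o1^T W3 o2 couples the two stream outputs.
        bilinear = dot(o1, [dot(row, o2) for row in self.W3])
        return sigmoid(self.bias + linear + bilinear)


model = TwoStreamMLP(n_in=4, n_hidden=3)
probability = model.predict([0.1, 0.5, -0.2, 1.0])
```

Dropping the two gates and the `bilinear` term reduces this sketch to the plain two-MLP combination that the abstract uses as its starting point. A real model would be written in a deep learning framework and trained on labeled click data; the pure-Python version here only illustrates the data flow.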