Click-through rate (CTR) prediction is one of the fundamental tasks in online advertising and recommendation. While the multi-layer perceptron (MLP) serves as a core component in many deep CTR prediction models, it is widely recognized that a vanilla MLP network alone is inefficient at learning multiplicative feature interactions. As such, many two-stream interaction models (e.g., DeepFM and DCN) have been proposed that integrate an MLP network with another dedicated network for enhanced CTR prediction. As the MLP stream learns feature interactions implicitly, existing research focuses mainly on enhancing explicit feature interactions in the complementary stream. In contrast, our empirical study shows that a well-tuned two-stream MLP model that simply combines two MLPs can achieve surprisingly good performance, a result that has not been reported in existing work. Based on this observation, we further propose feature selection and interaction aggregation layers that can be easily plugged in to build an enhanced two-stream MLP model, FinalMLP. These layers not only enable differentiated feature inputs for the two streams but also effectively fuse stream-level interactions across them. Our evaluation results on four open benchmark datasets, as well as an online A/B test in our industrial system, show that FinalMLP achieves better performance than many sophisticated two-stream CTR models. Our source code will be available at MindSpore/models and FuxiCTR/model_zoo.
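To make the two-stream structure concrete, the following is a minimal NumPy sketch of a forward pass with a feature selection step in front of each MLP and an aggregation step over the two stream outputs. The abstract does not specify the form of either layer, so the choices here are assumptions made for illustration only: a sigmoid gate that reweights each stream's input, and a fusion made of two linear terms plus a bilinear term. The class name `TwoStreamMLP`, its parameter names, and the layer sizes are hypothetical, the weights are random and untrained, and the input stands in for already-concatenated feature embeddings.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_mlp(rng, sizes):
    # One (weight, bias) pair per layer; small random weights, zero biases.
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def mlp(x, layers):
    # Plain feed-forward pass with ReLU after every layer.
    for W, b in layers:
        x = np.maximum(x @ W + b, 0.0)
    return x


class TwoStreamMLP:
    """Illustrative two-stream MLP (hypothetical names, untrained weights)."""

    def __init__(self, in_dim, hidden=(32, 16), seed=0):
        rng = np.random.default_rng(seed)
        # Assumed feature selection: one gating matrix per stream, so each
        # stream sees a differently reweighted copy of the same input.
        self.gate1 = rng.normal(0.0, 0.1, (in_dim, in_dim))
        self.gate2 = rng.normal(0.0, 0.1, (in_dim, in_dim))
        self.mlp1 = init_mlp(rng, (in_dim, *hidden))
        self.mlp2 = init_mlp(rng, (in_dim, *hidden))
        d = hidden[-1]
        # Assumed aggregation: linear terms plus a bilinear cross-stream term.
        self.w1 = rng.normal(0.0, 0.1, d)
        self.w2 = rng.normal(0.0, 0.1, d)
        self.W3 = rng.normal(0.0, 0.1, (d, d))
        self.b = 0.0

    def forward(self, x):
        # x: (batch, in_dim) array of concatenated feature embeddings.
        h1 = x * 2.0 * sigmoid(x @ self.gate1)  # gated input for stream 1
        h2 = x * 2.0 * sigmoid(x @ self.gate2)  # gated input for stream 2
        o1 = mlp(h1, self.mlp1)
        o2 = mlp(h2, self.mlp2)
        # Fuse the stream outputs into a single logit per example.
        cross = np.einsum("bi,ij,bj->b", o1, self.W3, o2)
        logit = self.b + o1 @ self.w1 + o2 @ self.w2 + cross
        return sigmoid(logit)  # predicted click probability


if __name__ == "__main__":
    x = np.random.default_rng(1).normal(size=(4, 8))
    print(TwoStreamMLP(in_dim=8).forward(x))
```

Removing the two gates and the bilinear term reduces this sketch to the plain combination of two MLPs that the abstract uses as its starting point, which is why both pieces are kept as separate, pluggable steps.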