Cross features play an important role in click-through rate (CTR) prediction. Most existing methods adopt a DNN-based model to capture cross features implicitly. These implicit methods may lead to suboptimal performance because they are limited in explicit semantic modeling. Although traditional statistical explicit semantic cross features can address this problem, they still suffer from several challenges, including a lack of generalization and expensive memory cost. Few works focus on tackling these challenges. In this paper, we take the first step in learning explicit semantic cross features and propose Pre-trained Cross Feature learning Graph Neural Networks (PCF-GNN), a GNN-based pre-trained model that aims to generate cross features in an explicit fashion. Extensive experiments are conducted on both public and industrial datasets, where PCF-GNN performs competitively in both prediction performance and memory efficiency across various tasks.
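The two challenges named above, lack of generalization and memory cost, can be illustrated with a minimal sketch. It assumes that a statistical explicit cross feature is the empirical CTR of a co-occurring feature pair; this is an illustrative assumption, since the abstract does not define the statistic. The function name `build_cross_ctr_table` and the toy log are hypothetical, and the sketch is not the PCF-GNN method itself.

```python
from collections import defaultdict


def build_cross_ctr_table(logs):
    """Map each observed (user_feature, ad_feature) pair to its empirical CTR.

    Assumption for illustration: the "statistical cross feature" of a pair is
    clicks / impressions counted over the log.
    """
    clicks = defaultdict(int)
    impressions = defaultdict(int)
    for user_feat, ad_feat, clicked in logs:
        pair = (user_feat, ad_feat)
        impressions[pair] += 1
        clicks[pair] += int(clicked)
    # One table entry per observed pair: memory grows with the number of pairs.
    return {pair: clicks[pair] / impressions[pair] for pair in impressions}


# Hypothetical toy impression log: (user feature, ad feature, clicked?)
logs = [
    ("user_a", "ad_x", 1),
    ("user_a", "ad_x", 0),
    ("user_a", "ad_y", 0),
    ("user_b", "ad_x", 1),
]

table = build_cross_ctr_table(logs)

# A pair seen in the log has an explicit, interpretable value.
seen_value = table.get(("user_a", "ad_x"))

# A pair never seen in the log has no entry at all: no generalization.
unseen_value = table.get(("user_b", "ad_y"))
```

In this sketch the table stores one entry per observed pair, so its size can approach the product of the two feature vocabularies, and a pair absent from the log yields no value. A learned model that produces the cross feature from the two features' representations would, in principle, avoid both issues, which is the motivation the abstract gives for PCF-GNN.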