Click-through rate (CTR) prediction is a fundamental task in recommender systems. Most previous CTR models are built on the Wide & Deep structure and have gradually evolved into parallel structures composed of different modules. However, simply accumulating parallel structures increases structural complexity and lengthens training time. Moreover, because the output layer uses a Sigmoid activation function, linearly adding the activation values of the parallel structures during training tends to push samples into the weak-gradient interval of the Sigmoid, which causes a weak-gradient phenomenon and reduces the effectiveness of training. To this end, this paper proposes a Parallel Heterogeneous Network (PHN) model, which constructs a parallel-structured network from three different interaction analysis methods and uses Soft Selection Gating (SSG) to extract features from heterogeneous data with different structures. Finally, residual links with trainable parameters are used in the network to mitigate the weak-gradient phenomenon. We demonstrate the effectiveness of PHN in extensive comparative experiments and visualize the model's performance during training and across its structure.
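The weak-gradient effect described above can be illustrated with a small numerical sketch. It is not the PHN implementation: the three branch logits are hypothetical values chosen for illustration, and the code only shows the general property that the Sigmoid derivative, σ(z)(1 − σ(z)), shrinks as the summed pre-activation moves away from zero.

```python
import math


def sigmoid(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z: float) -> float:
    """Derivative of the sigmoid with respect to its input z."""
    s = sigmoid(z)
    return s * (1.0 - s)


# Hypothetical pre-activation outputs of three parallel branches
# for a single training sample (illustrative values only).
branch_logits = [2.0, 2.5, 1.5]

# Gradient if only the first branch fed the output layer.
single_branch_grad = sigmoid_grad(branch_logits[0])

# Gradient when the branch outputs are linearly added before the sigmoid.
summed_grad = sigmoid_grad(sum(branch_logits))

print(f"single branch: z={branch_logits[0]:.1f}, grad={single_branch_grad:.4f}")
print(f"summed:        z={sum(branch_logits):.1f}, grad={summed_grad:.4f}")
```

Under these assumed values, the summed logit lies much further from zero than any single branch output, so the gradient passed back through the output layer is far smaller. This is the kind of saturation that the residual links with trainable parameters are intended to mitigate.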