Representation learning is a central topic in machine learning. In click-through rate (CTR) prediction, most features are represented as embedding vectors and learned jointly with the model's other parameters. As CTR models have developed, feature representation learning has become an active research area, studied extensively by both industrial and academic researchers in recent years. This survey aims to summarize feature representation learning from a broader perspective and to pave the way for future research. To that end, we first present a taxonomy of current research methods on feature representation learning, organized around two main issues: (i) which features to represent and (ii) how to represent these features. We then describe each method in detail with respect to these two issues. Finally, the survey concludes with a discussion of future directions for the field.