Feature interaction is widely recognized as an important problem in machine learning and is essential for click-through rate (CTR) prediction. In recent years, Deep Neural Networks (DNNs) have been widely used in industrial CTR prediction because they can automatically learn implicit nonlinear interactions from raw sparse features. However, the implicit feature interactions learned by DNNs cannot fully retain the representation capacity of the original, empirically designed feature interactions (e.g., the Cartesian product). For example, simply learning the combination of feature A and feature B, <A, B>, as a new feature through an explicit Cartesian product representation can outperform previous implicit feature interaction models, including factorization machine (FM)-based models and their variants. In this paper, we propose a Co-Action Network (CAN) to approximate explicit pairwise feature interactions without introducing too many additional parameters. More specifically, given feature A and its associated feature B, their interaction is modeled by learning two sets of parameters: 1) the embedding of feature A, and 2) a Multi-Layer Perceptron (MLP) that represents feature B. The approximated feature interaction is obtained by passing the embedding of feature A through the MLP of feature B. We refer to such a pairwise feature interaction as feature co-action; a Co-Action Network unit of this kind provides a powerful capacity for fitting complex feature interactions. Experimental results on public and industrial datasets show that CAN outperforms state-of-the-art CTR models and the Cartesian product method. Moreover, CAN has been deployed in the display advertising system at Alibaba, obtaining a 12% improvement in CTR and an 8% improvement in Revenue Per Mille (RPM), a substantial improvement for the business.
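
The co-action idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, the tanh activation, the random initialization, the vocabulary sizes, and the names `emb_a`, `mlp_b`, `co_action`, and `param_counts` are all assumptions chosen for illustration. Each value of feature B owns a flat parameter vector that is reshaped into the weights of a small MLP, and the embedding of feature A is fed through that MLP.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not values from the paper.
N_A, N_B = 1000, 500   # vocabulary sizes of feature A and feature B
EMB_DIM = 4            # size of feature A's embedding
HIDDEN = 4             # hidden width of feature B's MLP
OUT_DIM = 2            # size of the co-action output

# Parameters per value of B: W1, b1, W2, b2 stored as one flat vector.
MLP_PARAMS = EMB_DIM * HIDDEN + HIDDEN + HIDDEN * OUT_DIM + OUT_DIM

# Parameter set 1: one embedding per value of feature A.
emb_a = rng.normal(scale=0.1, size=(N_A, EMB_DIM))
# Parameter set 2: one small MLP (as a flat vector) per value of feature B.
mlp_b = rng.normal(scale=0.1, size=(N_B, MLP_PARAMS))


def co_action(a_id: int, b_id: int) -> np.ndarray:
    """Pass the embedding of A through the MLP that represents B."""
    x = emb_a[a_id]
    p = mlp_b[b_id]
    i = 0
    w1 = p[i:i + EMB_DIM * HIDDEN].reshape(EMB_DIM, HIDDEN)
    i += EMB_DIM * HIDDEN
    b1 = p[i:i + HIDDEN]
    i += HIDDEN
    w2 = p[i:i + HIDDEN * OUT_DIM].reshape(HIDDEN, OUT_DIM)
    i += HIDDEN * OUT_DIM
    b2 = p[i:i + OUT_DIM]
    h = np.tanh(x @ w1 + b1)   # tanh is an assumed activation
    return h @ w2 + b2


def param_counts() -> tuple[int, int]:
    """Compare parameter counts: co-action vs. a hypothetical explicit
    Cartesian-product table with one OUT_DIM vector per <A, B> pair."""
    co = N_A * EMB_DIM + N_B * MLP_PARAMS
    cartesian = N_A * N_B * OUT_DIM
    return co, cartesian


if __name__ == "__main__":
    print(co_action(3, 5).shape)
    print(param_counts())
```

Under these assumed sizes, the co-action parameters grow with the sum of the two vocabularies, whereas an explicit Cartesian product table grows with their product, which is the trade-off the abstract alludes to.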


