Learning Continuous User Representations through Hybrid Filtering with doc2vec

Players in the online ad ecosystem are struggling to acquire the user data required for precise targeting. Audience look-alike modeling has the potential to alleviate this issue, but model performance depends strongly on the quantity and quality of available data. To maximize the predictive performance of our look-alike modeling algorithms, we propose two novel hybrid filtering techniques that use the recent neural probabilistic language model algorithm doc2vec. We apply these methods to data from a large mobile ad exchange and to additional app metadata acquired from the Apple App Store and Google Play store. First, we model mobile app users through their app usage histories and app descriptions (user2vec). Second, we introduce context awareness to that model by incorporating additional user and app-related metadata in model training (context2vec). Our findings are threefold: (1) The quality of recommendations provided by user2vec is notably higher than that of current state-of-the-art techniques. (2) User representations generated through hybrid filtering using doc2vec prove to be highly valuable features in supervised machine learning models for look-alike modeling. This represents the first application of hybrid filtering user models using neural probabilistic language models, specifically doc2vec, in look-alike modeling. (3) Incorporating context metadata in the doc2vec model training process to introduce context awareness has positive effects on performance and is superior to directly including the data as features in the downstream supervised models.
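
To make the user2vec and context2vec inputs concrete, the following is a minimal sketch of how one user's app usage history and the corresponding app descriptions might be combined into a single token list (a "document") that a doc2vec implementation could then embed. It covers only the document-building step, not the doc2vec training itself. The function names, the toy data, and in particular the pseudo-token encoding of context metadata are assumptions made for illustration; the abstract does not specify how the authors encode either input.

```python
from typing import Dict, List, Optional


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer; a real pipeline would likely do more."""
    return text.lower().split()


def build_user_document(
    app_history: List[str],
    app_descriptions: Dict[str, str],
    context: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Turn one user's app usage history into a token list ("document").

    Assumption: the user document is the concatenation of the descriptions
    of the apps the user has used, in history order.
    """
    tokens: List[str] = []
    # Hypothetical context encoding (context2vec-style): one pseudo-token
    # per metadata field, emitted in sorted key order for determinism.
    if context:
        for key in sorted(context):
            tokens.append(f"__{key}={context[key]}__")
    for app_id in app_history:
        # Apps without a known description contribute no tokens.
        tokens.extend(tokenize(app_descriptions.get(app_id, "")))
    return tokens


# Toy data, invented for illustration only.
APP_DESCRIPTIONS = {
    "app_a": "Track your daily runs",
    "app_b": "Healthy recipes and meal plans",
}
USER_HISTORIES = {
    "user_1": ["app_a", "app_b"],
    "user_2": ["app_b"],
}

# user2vec-style documents: app descriptions only.
user_docs = {
    user_id: build_user_document(history, APP_DESCRIPTIONS)
    for user_id, history in USER_HISTORIES.items()
}

# context2vec-style document: the same text plus context pseudo-tokens.
context_doc = build_user_document(
    ["app_b"], APP_DESCRIPTIONS, context={"os": "ios", "country": "de"}
)
```

Each resulting token list, keyed by its user ID, would then serve as one training document for a doc2vec model, whose learned document vector acts as the continuous user representation used as a feature downstream.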

