Cross-domain recommendation (CDR) is an effective way to alleviate the data sparsity problem. Content-based CDR is one of the most promising branches, since most kinds of products can be described by a piece of text, which is especially valuable when cold-start users or items have few interactions. However, two vital issues remain under-explored: (1) From the content modeling perspective, sufficiently long text descriptions are usually scarce in real recommender systems. Lightweight textual features, such as a few keywords or tags, are far more often available, yet existing methods model them poorly. (2) From the CDR perspective, not all inter-domain interests help to infer intra-domain interests. Because of domain-specific features, some signals that benefit recommendation in the source domain are harmful to recommendation in the target domain. How to distill the useful interests is therefore crucial. To tackle these two problems, we propose a metapath and multi-interest aggregated graph neural network (M2GNN). Specifically, to model tag-based content, we construct a heterogeneous information network that captures the semantic relatedness between users, items, and tags in all domains. The metapath schema is predefined according to domain-specific knowledge, with one metapath per domain. User representations are learned by a GNN with a hierarchical aggregation framework, in which intra-metapath aggregation first filters out trivial tags and inter-metapath aggregation then filters out useless interests. Offline experiments demonstrate that M2GNN achieves significant improvements over state-of-the-art methods, and online A/B tests demonstrate significant improvements over the current industrial recommender system at Dianping. Further analysis shows that M2GNN offers interpretable recommendations.
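The two-level aggregation described above can be illustrated with a minimal sketch. This is not the M2GNN implementation: the abstract does not specify the aggregation functions, so plain dot-product attention is assumed as a stand-in at both levels, each metapath is collapsed into a single interest vector for brevity, and all names (`attention_pool`, `hierarchical_user_embedding`, `user_query`, `tags_by_domain`) are hypothetical. Attention here only down-weights tags and interests softly, which may differ from the filtering the paper actually uses.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_pool(query, vectors):
    """Weighted sum of `vectors`, weighted by softmax(dot(query, v)).

    Assumption: dot-product attention stands in for whatever scoring
    function the paper uses.
    """
    weights = softmax([dot(query, v) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def hierarchical_user_embedding(user_query, tags_by_domain):
    """Aggregate tag embeddings into one user vector in two stages.

    Stage 1 (intra-metapath): within each domain's metapath, pool the tag
    embeddings so that tags poorly aligned with the user get low weight.
    Stage 2 (inter-metapath): pool the per-domain interest vectors so that
    domains poorly aligned with the user get low weight.
    """
    domains = sorted(tags_by_domain)
    interests = []
    tag_weights = {}
    for domain in domains:
        interest, weights = attention_pool(user_query, tags_by_domain[domain])
        interests.append(interest)
        tag_weights[domain] = weights
    user_vec, domain_weights = attention_pool(user_query, interests)
    return user_vec, tag_weights, dict(zip(domains, domain_weights))


# Toy usage with made-up 2-d embeddings (hypothetical data).
user_query = [1.0, 0.0]
tags_by_domain = {
    "food": [[2.0, 0.0], [0.0, 2.0]],
    "travel": [[-1.0, 0.0], [0.0, 1.0]],
}
user_vec, tag_weights, domain_weights = hierarchical_user_embedding(
    user_query, tags_by_domain
)
```

Returning the attention weights alongside the user vector is a deliberate choice in this sketch: inspecting which tags and which domains received high weight is one simple way such a model could support interpretable recommendations.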