Although bipartite shopping graphs offer a straightforward way to model search behavior, they suffer from two challenges: 1) The majority of items are searched only sporadically and hence have noisy and sparse query associations, leading to a \textit{long-tail} distribution. 2) Infrequent queries are more likely to link to popular items, leading to another hurdle known as \textit{disassortative mixing}. To address these two challenges, we go beyond the bipartite graph and take a hypergraph perspective, introducing a new paradigm that leverages \underline{auxiliary} information from anonymized customer engagement sessions to assist the \underline{main task} of query-item link prediction. This auxiliary information is available at web scale in the form of search logs. We treat all items appearing in the same customer session as a single hyperedge, under the hypothesis that items in a customer session are unified by a common shopping interest. With these hyperedges, we augment the original bipartite graph into a new \textit{hypergraph}. We develop a \textit{\textbf{D}ual-\textbf{C}hannel \textbf{A}ttention-Based \textbf{H}ypergraph Neural Network} (\textbf{DCAH}), which synergizes information from two potentially noisy sources: the original query-item edges and the item-item hyperedges. Because the extra hyperedges connect tail items to more of the graph, link prediction for those items improves. We further integrate DCAH with self-supervised graph pre-training and/or DropEdge training, both of which effectively alleviate disassortative mixing. Extensive experiments on three proprietary e-commerce datasets show that DCAH yields significant improvements of up to \textbf{24.6\% in mean reciprocal rank (MRR)} and \textbf{48.3\% in recall} compared to GNN-based baselines. Our source code is available at \url{https://github.com/amazon-science/dual-channel-hypergraph-neural-network}.
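
The augmentation step, in which each customer session becomes one hyperedge over the items it contains, can be illustrated with a minimal Python sketch. It is not taken from the released code: the function names `build_hypergraph` and `session_neighbors`, the toy queries, items, and sessions, and the choice to drop single-item sessions are all assumptions made for illustration.

```python
from collections import defaultdict


def build_hypergraph(query_item_edges, sessions):
    """Augment a bipartite query-item graph with session hyperedges.

    query_item_edges: iterable of (query, item) pairs (the original graph).
    sessions: iterable of item lists, one list per customer session.
    Assumption: sessions with fewer than two distinct items are dropped,
    since they connect nothing.
    """
    item_to_queries = defaultdict(set)
    for query, item in query_item_edges:
        item_to_queries[item].add(query)

    # One hyperedge per session: the set of distinct items in that session.
    hyperedges = []
    for session in sessions:
        members = frozenset(session)
        if len(members) >= 2:
            hyperedges.append(members)

    # Inverse index: which hyperedges does each item belong to?
    item_to_hyperedges = defaultdict(set)
    for idx, members in enumerate(hyperedges):
        for item in members:
            item_to_hyperedges[item].add(idx)

    return item_to_queries, hyperedges, item_to_hyperedges


def session_neighbors(item, hyperedges, item_to_hyperedges):
    """Items that share at least one hyperedge (session) with `item`."""
    neighbors = set()
    for idx in item_to_hyperedges.get(item, ()):
        neighbors |= hyperedges[idx]
    neighbors.discard(item)
    return neighbors


# Hypothetical toy data: "insole_d" is a tail item with no query edges.
edges = [
    ("running shoes", "shoe_a"),
    ("running shoes", "shoe_b"),
    ("trail socks", "sock_c"),
]
sessions = [
    ["shoe_a", "sock_c", "insole_d"],
    ["shoe_b"],                # single-item session, dropped
    ["insole_d", "shoe_b"],
]

item_to_queries, hyperedges, item_to_hyperedges = build_hypergraph(edges, sessions)
tail_neighbors = session_neighbors("insole_d", hyperedges, item_to_hyperedges)
```

In this toy example, `insole_d` has no query associations in the bipartite graph, yet the two session hyperedges link it to three items that do. This is the sense in which hyperedges can give tail items additional connectivity; how DCAH aggregates over the two channels is not specified here and is not reproduced by this sketch.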