Deep learning has brought significant breakthroughs in sequential recommendation (SR) for capturing dynamic user interests. A series of recent studies has shown that models with more parameters usually achieve the best performance on SR tasks, which makes them difficult to deploy in real systems. Following the simple assumption that light networks may already suffice for certain users, we propose CANet, a conceptually simple yet highly scalable framework that assigns an adaptive network architecture in an input-dependent manner to reduce unnecessary computation. The core idea of CANet is to route the input user behaviors with a lightweight router module. Specifically, we first construct the routing space from submodels parameterized along multiple model dimensions, such as the number of layers, the hidden size, and the embedding size. To avoid extra storage overhead for the routing space, we employ a weight-slicing scheme that maintains all the submodels in exactly one network. Furthermore, we adopt several solutions to the discrete optimization issues caused by the router module. With these, CANet can adaptively adjust its network architecture for each input in an end-to-end manner while still capturing user preferences effectively. To evaluate our work, we conduct extensive experiments on benchmark datasets. Experimental results show that CANet reduces computation by 55% to 65% while preserving the accuracy of the original model. Our code is available at https://github.com/icantnamemyself/CANet.
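To make the weight-slicing idea more concrete, the following is a minimal sketch and not CANet's actual implementation. It assumes that a smaller submodel simply reuses the leading rows and columns of one shared weight matrix, so no extra parameters are stored. The names `SlicedLinear` and `route` are hypothetical, and the rule-based `route` function is only a stand-in for the learned router described above; the treatment of discrete optimization is not shown.

```python
import random


class SlicedLinear:
    """Dense layer whose narrower variants reuse a slice of one shared weight matrix."""

    def __init__(self, max_in, max_out, seed=0):
        rng = random.Random(seed)
        # The full weight matrix is stored once; every submodel reads a sub-block of it.
        self.weight = [
            [rng.uniform(-0.1, 0.1) for _ in range(max_in)] for _ in range(max_out)
        ]
        self.bias = [0.0] * max_out

    def forward(self, x, out_dim):
        in_dim = len(x)
        # Use only the first out_dim rows and the first in_dim columns.
        return [
            sum(self.weight[o][i] * x[i] for i in range(in_dim)) + self.bias[o]
            for o in range(out_dim)
        ]


def route(behavior_length, candidate_dims, length_thresholds):
    """Hypothetical stand-in router: shorter histories get narrower submodels."""
    for dim, threshold in zip(candidate_dims, length_thresholds):
        if behavior_length <= threshold:
            return dim
    # Histories longer than every threshold use the widest submodel.
    return candidate_dims[-1]


layer = SlicedLinear(max_in=4, max_out=4)
features = [1.0, 2.0, 3.0, 4.0]

# A user with a short history is routed to the 2-dimensional submodel.
dim = route(behavior_length=3, candidate_dims=[2, 4], length_thresholds=[5])
small_output = layer.forward(features[:dim], out_dim=dim)

# The full model uses the same stored weights, only more of them.
full_output = layer.forward(features, out_dim=4)
```

In this sketch the narrow submodel performs a 2x2 matrix-vector product instead of a 4x4 one while reading from the same stored parameters, which is the kind of saving the abstract attributes to input-dependent routing.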