Embedding learning is an important technique in deep recommendation models for mapping categorical features to dense vectors. However, embedding tables often require an extremely large number of parameters, which makes them a storage and efficiency bottleneck. Distributed training solutions have been adopted to partition the embedding tables across multiple devices, but the tables can easily cause load imbalances if they are not partitioned carefully. This is a significant design challenge in distributed systems known as embedding table sharding, i.e., how to partition the embedding tables so that costs are balanced across devices. The task is non-trivial because 1) it is hard to measure the cost efficiently and precisely, and 2) the partition problem is known to be NP-hard. In this work, we introduce our novel practice at Meta, namely AutoShard, which uses a neural cost model to directly predict multi-table costs and leverages deep reinforcement learning to solve the partition problem. Experimental results on an open-sourced large-scale synthetic dataset and Meta's production dataset demonstrate the superiority of AutoShard over the heuristics. Moreover, the learned policy of AutoShard can transfer to sharding tasks with various numbers of tables and different ratios of unseen tables without any fine-tuning. Furthermore, AutoShard can efficiently shard hundreds of tables in seconds. The effectiveness, transferability, and efficiency of AutoShard make it desirable for production use. Our algorithms have been deployed in Meta's production environment. A prototype is available at https://github.com/daochenzha/autoshard
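
To make the sharding problem concrete, the following is a minimal sketch of a greedy load-balancing heuristic of the general kind that cost-based sharding baselines use. It is not AutoShard and is not taken from the prototype repository. The function name `greedy_shard`, the example costs, and above all the assumption that each table has a single known cost that adds up linearly on a device are illustrative simplifications; the abstract itself notes that real multi-table costs are hard to measure.

```python
import heapq


def greedy_shard(table_costs, num_devices):
    """Assign tables to devices, largest cost first, always picking the
    currently least-loaded device.

    Assumption (hypothetical): per-table costs are known and additive.
    Returns (assignment, loads), where assignment[d] lists table indices
    placed on device d and loads[d] is the summed cost on device d.
    """
    # Min-heap of (current_load, device_id) so the lightest device pops first.
    heap = [(0.0, d) for d in range(num_devices)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_devices)]

    # Visit tables in descending order of cost.
    order = sorted(range(len(table_costs)),
                   key=lambda i: table_costs[i], reverse=True)
    for i in order:
        load, d = heapq.heappop(heap)
        assignment[d].append(i)
        heapq.heappush(heap, (load + table_costs[i], d))

    loads = [sum(table_costs[i] for i in shard) for shard in assignment]
    return assignment, loads


if __name__ == "__main__":
    costs = [4.0, 3.0, 3.0, 2.0, 2.0]  # made-up per-table costs
    assignment, loads = greedy_shard(costs, num_devices=2)
    print(assignment, loads)
```

On these made-up costs the greedy pass ends with device loads of 8 and 6, although a 7/7 split exists (tables 0 and 1 on one device, the rest on the other). Gaps like this, together with the non-additive nature of real costs, are what motivate replacing a hand-written heuristic with a learned cost model and a learned policy.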