Embedding tables are usually huge in click-through rate (CTR) prediction models. To train and deploy CTR models efficiently and economically, it is necessary to compress their embedding tables at the training stage. To this end, we formulate a novel quantization training paradigm, termed low-precision training (LPT), that compresses the embeddings during training. We also provide a theoretical analysis of its convergence. The results show that, in LPT, stochastic weight quantization achieves a faster convergence rate and a smaller convergence error than deterministic weight quantization. Further, to reduce accuracy degradation, we propose adaptive low-precision training (ALPT), which learns the step size (i.e., the quantization resolution) through gradient descent. Experiments on two real-world datasets confirm our analysis and show that ALPT significantly improves prediction accuracy, especially at extremely low bit widths. For the first time in CTR models, we successfully train 8-bit embeddings without sacrificing prediction accuracy. The code of ALPT is publicly available.
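To make the idea concrete, below is a minimal PyTorch sketch of stochastic weight quantization with a step size learned by gradient descent, as the abstract describes. The function name `stochastic_quantize`, the straight-through estimator, and the initialization values are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn


def stochastic_quantize(w: torch.Tensor, step: torch.Tensor, bits: int = 8) -> torch.Tensor:
    """Quantize w onto a uniform grid of width `step` using stochastic rounding.

    A straight-through estimator lets gradients flow back to both the
    full-precision weights and the step size, so the step size itself can be
    updated by ordinary gradient descent (the core idea behind ALPT).
    """
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    scaled = w / step
    floor = torch.floor(scaled)
    # Round up with probability equal to the fractional part (unbiased rounding).
    q = floor + torch.bernoulli(scaled - floor)
    q = torch.clamp(q, qmin, qmax)
    # Straight-through: the forward pass uses the rounded value, the backward
    # pass treats rounding as the identity.
    q = scaled + (q - scaled).detach()
    return q * step


# Hypothetical usage: an embedding table quantized on each forward pass,
# with a single learnable step size for the whole table.
emb = nn.Parameter(torch.randn(10_000, 16) * 0.01)   # full-precision master weights
step = nn.Parameter(torch.tensor(1e-3))               # learnable quantization step size

ids = torch.randint(0, 10_000, (32,))
rows = stochastic_quantize(emb, step, bits=8)[ids]    # quantized lookups used downstream
```

In this sketch, stochastic rounding keeps the quantized weights unbiased, and the step size receives gradients through both the division `w / step` and the final rescaling, so it adapts jointly with the embeddings during training.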