Fine-tuning pre-trained language models improves the quality of commercial reply suggestion systems, but at the cost of unsustainable training times. Popular approaches to reducing training time are resource intensive, so we explore low-cost model compression techniques such as Layer Dropping and Layer Freezing. We demonstrate the efficacy of these techniques in large-data scenarios, reducing the training time of a commercial email reply suggestion system by 42% without affecting model relevance or user engagement. We further study the robustness of these techniques to ablations over pre-trained model and dataset size, and share several insights and recommendations for commercial applications.
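As a rough illustration of the two compression techniques named above, the following minimal sketch shows how encoder layers of a pre-trained transformer can be dropped and frozen before fine-tuning. It assumes a HuggingFace-style BERT encoder; the checkpoint name "bert-base-uncased" and the specific choice of layers to keep or freeze are illustrative assumptions, not the configuration used in the paper.

```python
import torch
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")

# Layer Dropping: keep only a subset of encoder layers (here, every other
# layer) to shrink the network before fine-tuning.
keep_layers = [0, 2, 4, 6, 8, 10]  # illustrative choice
model.encoder.layer = torch.nn.ModuleList(
    [model.encoder.layer[i] for i in keep_layers]
)
model.config.num_hidden_layers = len(keep_layers)

# Layer Freezing: disable gradients for the embeddings and the lower half of
# the remaining layers, so only the upper layers are updated during
# fine-tuning.
for param in model.embeddings.parameters():
    param.requires_grad = False
for layer in model.encoder.layer[: len(keep_layers) // 2]:
    for param in layer.parameters():
        param.requires_grad = False
```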