Reducing Model Jitter: Stable Re-training of Semantic Parsers in Production Environments

Retraining modern deep learning systems can lead to variations in model performance even when the training data and hyper-parameters are unchanged and only the random seed differs. We call this phenomenon model jitter. This issue is often exacerbated in production settings, where models are retrained on noisy data. In this work we tackle the problem of stable retraining with a focus on conversational semantic parsers. We first quantify the model jitter problem by introducing the model agreement metric and showing how it varies with dataset noise and model size. We then demonstrate the effectiveness of various jitter reduction techniques such as ensembling and distillation. Lastly, we discuss practical trade-offs between such techniques and show that co-distillation provides a sweet spot in terms of jitter reduction for semantic parsing systems with only a modest increase in resource usage.
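The abstract names a model agreement metric without defining it. As a rough illustration only, the sketch below assumes agreement means the exact-match rate between the outputs of two retrained models on the same inputs, averaged over all pairs of runs. The function name `pairwise_agreement` and this definition are assumptions made for the example, not the paper's formulation.

```python
from itertools import combinations


def pairwise_agreement(predictions):
    """Mean exact-match rate over all pairs of retraining runs.

    `predictions` holds one list per run (e.g. per random seed); every list
    contains that run's output for the same examples, in the same order.
    This definition is an assumption for illustration, not the paper's.
    """
    if len(predictions) < 2:
        raise ValueError("need predictions from at least two runs")
    n_examples = len(predictions[0])
    if n_examples == 0:
        raise ValueError("need at least one example")
    if any(len(run) != n_examples for run in predictions):
        raise ValueError("all runs must cover the same examples")

    pair_scores = []
    for run_a, run_b in combinations(predictions, 2):
        # Count examples where the two runs produce identical outputs.
        matches = sum(a == b for a, b in zip(run_a, run_b))
        pair_scores.append(matches / n_examples)
    return sum(pair_scores) / len(pair_scores)


# Hypothetical parser outputs from three seeds on four utterances.
runs = [
    ["A", "B", "C", "D"],
    ["A", "B", "C", "E"],
    ["A", "X", "C", "D"],
]
score = pairwise_agreement(runs)
```

Under this reading, a score of 1.0 means every run produced identical outputs, and lower scores indicate more jitter between retrained models.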

