Even the largest neural networks make errors, and once-correct predictions can become invalid as the world changes. Model editors make local updates to the behavior of base (pre-trained) models to inject updated knowledge or correct undesirable behaviors. Existing model editors have shown promise, but also suffer from insufficient expressiveness: they struggle to accurately model an edit's intended scope (examples affected by the edit), leading to inaccurate predictions for test inputs loosely related to the edit, and they often fail altogether after many edits. As a higher-capacity alternative, we propose Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model (SERAC), which stores edits in an explicit memory and learns to reason over them to modulate the base model's predictions as needed. To enable more rigorous evaluation of model editors, we introduce three challenging language model editing problems based on question answering, fact-checking, and dialogue generation. We find that only SERAC achieves high performance on all three problems, consistently outperforming existing approaches to model editing by a significant margin. Code, data, and additional project information will be made available at https://sites.google.com/view/serac-editing.
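The core mechanism described above — an explicit edit memory, a scope classifier that decides whether a test input is covered by any stored edit, and a counterfactual model that overrides the base model for in-scope inputs — can be illustrated with a minimal sketch. This is not the paper's implementation: the token-overlap scorer and the dictionary-style "models" below are toy stand-ins for the learned components in SERAC, and all names are hypothetical.

```python
def token_overlap(a, b):
    """Toy scope score: Jaccard similarity over tokens.

    A stand-in for SERAC's learned scope classifier; anything scoring
    above a threshold is treated as falling within an edit's scope.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)


class SemiParametricEditor:
    """Sketch of semi-parametric editing with a retrieval-augmented memory."""

    def __init__(self, base_model, threshold=0.5):
        self.base_model = base_model  # frozen base predictor (left untouched)
        self.memory = []              # explicit edit memory: (edit input, edit label)
        self.threshold = threshold

    def add_edit(self, edit_input, edit_label):
        # Editing never updates base-model weights; it only appends to memory.
        self.memory.append((edit_input, edit_label))

    def __call__(self, x):
        if self.memory:
            # Retrieve the most relevant stored edit for this input.
            score, label = max(
                ((token_overlap(x, ei), el) for ei, el in self.memory),
                key=lambda t: t[0],
            )
            if score >= self.threshold:
                # In scope: a "counterfactual model" answers from the edit.
                # Here that model trivially returns the stored label.
                return label
        # Out of scope: defer to the unmodified base model.
        return self.base_model(x)


base = lambda x: "stale answer"
editor = SemiParametricEditor(base, threshold=0.5)
editor.add_edit("who is the uk prime minister", "edited answer")

print(editor("who is the uk prime minister"))   # in scope -> edited answer
print(editor("what is the capital of france"))  # out of scope -> stale answer
```

Because the base model's parameters are never changed, many edits can be applied simply by growing the memory, which is the property that lets this style of editor avoid the degradation that parametric editors exhibit after many sequential edits.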