修订版变换器: 获得无无的回转 (Revision Transformers: Getting RiT of No-Nos)

Current transformer language models (LM) are large-scale models with billions of parameters. They have been shown to provide high performances on a variety of tasks but are also prone to shortcut learning and bias. Addressing such incorrect model behavior via parameter adjustments is very costly. This is particularly problematic for updating dynamic concepts, such as moral values, which vary culturally or interpersonally. In this work, we question the current common practice of storing all information in the model parameters and propose the Revision Transformer (RiT) employing information retrieval to facilitate easy model updating. The specific combination of a large-scale pre-trained LM that inherently but also diffusely encodes world knowledge with a clear-structured revision engine makes it possible to update the model's knowledge with little effort and the help of user interaction. We exemplify RiT on a moral dataset and simulate user feedback demonstrating strong performance in model revision even with small data. This way, users can easily design a model regarding their preferences, paving the way for more transparent and personalized AI models.

翻译：目前的变压器语言模型(LM)是具有数十亿参数的大型模型,显示它们为各种任务提供了高超的性能,但也容易出现捷径学习和偏差。通过参数调整处理这种不正确的模型行为非常昂贵。这对更新动态概念尤其有问题,例如道德价值,这些道德价值在文化上或人际上各不相同。在这项工作中,我们质疑目前将所有信息储存在模型参数中的共同做法,并提议采用信息检索系统,利用信息检索系统来方便更新模型。大规模预先培训的LM的具体组合,它既能将世界知识与清晰的修改引擎相拼凑,又能分散地编码成一个清晰的修改引擎,使得能够以很少的努力和用户互动的帮助来更新模型的知识。我们用道德数据集进行示范,并模拟用户反馈,表明即使在使用小数据进行模型修改时也表现良好。这样,用户可以很容易地设计一个关于其喜好的模式,为更加透明和个性化的AI模型铺平道路。

相关内容

MoDELS

关注 43

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

NeurlPS 2022 | 自然语言处理相关论文分类整理

专知会员服务

51+阅读 · 2022年10月2日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

最新《Transformers模型》教程，64页ppt

专知会员服务

324+阅读 · 2020年11月26日

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

专知会员服务

96+阅读 · 2020年4月18日