Although pretrained language models (PTLMs) have been shown to contain significant amounts of world knowledge, they can still produce inconsistent answers to questions when probed, even after specialized training techniques are used to reduce inconsistency. As a result, it can be hard to identify what the model actually "believes" about the world. Our goal is to reduce this problem, so that systems are more globally consistent and accurate in their answers. Our approach is to add a memory component - a BeliefBank - that records a model's answers, plus two mechanisms that use it to improve consistency among beliefs. First, a reasoning component - a weighted SAT solver - improves consistency by flipping answers that significantly clash with others. Second, a feedback component re-queries the model, this time using known beliefs as context. We show that, in a controlled experimental setting, these two mechanisms improve both accuracy and consistency. This is significant as it is a first step towards endowing models with an evolving memory, allowing them to construct a more coherent picture of the world.
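To make the reasoning step concrete, the sketch below shows the kind of weighted-constraint optimization the abstract describes: the model's raw answers are kept or flipped so as to maximize agreement with the model's confidences while minimizing penalties for violated implication constraints. This is a minimal illustration only; the statements, confidence values, penalty weights, and the brute-force search are invented for this example and are not the paper's actual solver or data.

```python
from itertools import product

# Hypothetical toy beliefs: statement -> (model's raw answer, model confidence).
# All values are illustrative assumptions, not taken from the paper.
beliefs = {
    "a swallow is a bird": (True, 0.9),
    "a swallow is an animal": (False, 0.6),   # clashes with the first constraint below
    "a swallow has feathers": (True, 0.7),
}

# Weighted implication constraints: (premise, conclusion, penalty if violated).
constraints = [
    ("a swallow is a bird", "a swallow is an animal", 2.0),
    ("a swallow is a bird", "a swallow has feathers", 2.0),
]

def score(assignment):
    """Reward agreement with the model's raw answers; penalize violated constraints."""
    s = 0.0
    for stmt, (answer, conf) in beliefs.items():
        if assignment[stmt] == answer:
            s += conf
    for premise, conclusion, penalty in constraints:
        if assignment[premise] and not assignment[conclusion]:
            s -= penalty
    return s

# Exhaustive search over truth assignments stands in for a weighted SAT solver here;
# it is only feasible because this toy example has three statements.
stmts = list(beliefs)
best = max(
    (dict(zip(stmts, values)) for values in product([True, False], repeat=len(stmts))),
    key=score,
)
print(best)  # "a swallow is an animal" is flipped to True to restore consistency
```

In this toy instance, keeping the raw answers costs a large constraint penalty, so the optimizer flips the low-confidence belief ("a swallow is an animal") rather than the high-confidence ones, which mirrors the abstract's description of flipping answers that significantly clash with others.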