Transformer-based large language models are trained to make predictions about the next word by aggregating representations of previous tokens through their self-attention mechanism. In the field of cognitive modeling, such attention patterns have recently been interpreted as embodying the process of cue-based retrieval, in which attention over multiple targets is taken to generate interference and latency during retrieval. Under this framework, this work first defines an entropy-based predictor that quantifies the diffuseness of self-attention, as well as distance-based predictors that capture the incremental change in attention patterns across timesteps. Moreover, following recent studies that question the informativeness of attention weights, we also experiment with alternative methods for incorporating vector norms into attention weights. Regression experiments using predictors calculated from the GPT-2 language model show that these predictors deliver a substantially better fit to held-out self-paced reading and eye-tracking data than a rigorous baseline including GPT-2 surprisal. Additionally, the distance-based predictors generally demonstrate higher predictive power, with effect sizes of up to 6.59 ms per standard deviation on self-paced reading times (compared to 2.82 ms for surprisal) and 1.05 ms per standard deviation on eye-gaze durations (compared to 3.81 ms for surprisal).
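As an illustration of how such predictors could be derived from a pretrained model, the sketch below computes a per-token attention-entropy measure and a simple distance-based measure from GPT-2's self-attention via the Hugging Face transformers library. This is a minimal sketch rather than the paper's released implementation: averaging over all layers and heads, the L1 distance metric, the shared-prefix alignment of consecutive attention rows, and the omission of the vector-norm weighting are all simplifying assumptions made here for illustration.

```python
# Minimal sketch (not the authors' released code): derive a per-token
# attention-entropy predictor and a simple distance-based predictor from
# GPT-2 self-attention. Layer/head averaging, the L1 distance, and the
# shared-prefix alignment are assumptions for illustration only.
import torch
from transformers import GPT2TokenizerFast, GPT2Model

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

text = "The horse raced past the barn fell."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

# Stack per-layer attentions and drop the batch dimension:
# (num_layers, num_heads, seq_len, seq_len)
attn = torch.stack(outputs.attentions).squeeze(1)
# Average over layers and heads: row i is token i's attention distribution
# over tokens 0..i (causal masking zeroes out later positions).
attn = attn.mean(dim=(0, 1))

# Entropy predictor: diffuseness of each token's attention distribution.
eps = 1e-12
entropy = -(attn * (attn + eps).log()).sum(dim=-1)

# Distance predictor: L1 distance between consecutive attention rows,
# compared over the shared prefix of previously seen tokens (an assumption).
seq_len = attn.size(0)
distance = torch.zeros(seq_len)
for i in range(1, seq_len):
    distance[i] = (attn[i, :i] - attn[i - 1, :i]).abs().sum()

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for tok, h, d in zip(tokens, entropy.tolist(), distance.tolist()):
    print(f"{tok!r}\tentropy={h:.3f}\tdistance={d:.3f}")
```

In a regression setting, per-token values like these would be aligned to the words of a reading-time corpus and entered as predictors alongside surprisal and other baseline covariates.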