Understanding Transformer-based models has attracted significant attention, as they lie at the heart of recent technological advances across machine learning. While most interpretability methods rely on running models over inputs, recent work has shown that a zero-pass approach, where parameters are interpreted directly without a forward/backward pass, is feasible for some Transformer parameters and for two-layer attention networks. In this work, we present a theoretical analysis where all parameters of a trained Transformer are interpreted by projecting them into the embedding space, that is, the space of vocabulary items they operate on. We derive a simple theoretical framework to support our arguments and provide ample evidence for its validity. First, we present an empirical analysis showing that parameters of both pretrained and fine-tuned models can be interpreted in embedding space. Second, we present two applications of our framework: (a) aligning the parameters of different models that share a vocabulary, and (b) constructing a classifier without training by ``translating'' the parameters of a fine-tuned classifier to parameters of a different model that was only pretrained. Overall, our findings open the door to interpretation methods that, at least in part, abstract away from model specifics and operate in the embedding space only.
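The core operation described above, projecting a parameter vector into embedding space and reading off the vocabulary items it most strongly relates to, can be sketched in a few lines. The following is a minimal toy illustration, not the paper's actual method: the vocabulary, embedding matrix, and parameter vector are all synthetic stand-ins, and we simply rank vocabulary items by their inner product with the projected vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and embedding matrix E of shape |V| x d (illustrative only).
vocab = ["cat", "dog", "car", "tree", "run"]
d = 8
E = rng.normal(size=(len(vocab), d))

# A hypothetical "parameter vector" from the model, e.g. one row of a
# feed-forward matrix; here constructed near the embedding of "dog".
w = E[1] + 0.1 * rng.normal(size=d)

# Zero-pass interpretation: project w into embedding space via E and
# rank vocabulary items by their score, with no forward pass over inputs.
scores = E @ w
top = [vocab[i] for i in np.argsort(scores)[::-1][:3]]
```

Since `w` was built close to the embedding of "dog", that token dominates the ranking; on a real model, the top-scoring tokens give a human-readable summary of what the parameter vector "writes" to or "reads" from.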