Discovering Latent Concepts Learned in BERT - 专知 Papers

May 15, 2022

Discovering Latent Concepts Learned in BERT

Fahim Dalvi, Abdul Rafae Khan, Firoj Alam, Nadir Durrani, Jia Xu, Hassan Sajjad

Source: arXiv, ICLR 2022

A large number of studies that analyze deep neural network models and their ability to encode various linguistic and non-linguistic concepts provide an interpretation of the inner mechanics of these models. The scope of the analyses is limited to pre-defined concepts that reinforce the traditional linguistic knowledge and do not reflect on how novel concepts are learned by the model. We address this limitation by discovering and analyzing latent concepts learned in neural network models in an unsupervised fashion and provide interpretations from the model's perspective. In this work, we study: i) what latent concepts exist in the pre-trained BERT model, ii) how the discovered latent concepts align or diverge from classical linguistic hierarchy and iii) how the latent concepts evolve across layers. Our findings show: i) a model learns novel concepts (e.g. animal categories and demographic groups), which do not strictly adhere to any pre-defined categorization (e.g. POS, semantic tags), ii) several latent concepts are based on multiple properties which may include semantics, syntax, and morphology, iii) the lower layers in the model dominate in learning shallow lexical concepts while the higher layers learn semantic relations and iv) the discovered latent concepts highlight potential biases learned in the model. We also release a novel BERT ConceptNet dataset (BCN) consisting of 174 concept labels and 1M annotated instances.
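The abstract says only that concepts are discovered "in an unsupervised fashion" and does not name the procedure. One plausible reading is that token representations are grouped by clustering, and each cluster is then inspected as a candidate concept. The sketch below illustrates that idea under loud assumptions: the 2-D vectors are hand-made stand-ins rather than BERT output, average-linkage agglomerative clustering is chosen for illustration only, and every name (`agglomerative_clusters`, `tokens`, `vectors`, `concepts`) is hypothetical.

```python
from math import dist


def agglomerative_clusters(vectors, num_clusters):
    """Repeatedly merge the two closest clusters (average linkage)
    until only num_clusters remain. Returns lists of vector indices."""
    clusters = [[i] for i in range(len(vectors))]

    def avg_distance(a, b):
        # Mean Euclidean distance over all cross-cluster pairs.
        total = sum(dist(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > num_clusters:
        best = None  # (distance, index of first cluster, index of second)
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = avg_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Assumption: toy vectors standing in for contextual token representations.
tokens = ["cat", "dog", "horse", "Monday", "Friday", "Sunday"]
vectors = [
    (0.90, 0.10), (0.80, 0.20), (0.85, 0.05),
    (0.10, 0.90), (0.20, 0.80), (0.15, 0.95),
]

# Each cluster is treated as one candidate "latent concept".
concepts = [
    sorted(tokens[i] for i in cluster)
    for cluster in agglomerative_clusters(vectors, 2)
]
print(concepts)
```

With real data, the vectors would come from a chosen layer of the model, and the number of clusters would be a tuning choice; neither is specified in the abstract.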
