This paper investigates how Transformer language models (LMs) fine-tuned for acceptability classification capture linguistic features. Our approach uses the best practices of topological data analysis (TDA) in NLP: we construct directed attention graphs from attention matrices, derive topological features from them, and feed them to linear classifiers. We introduce two novel features, chordality and the matching number, and show that TDA-based classifiers outperform fine-tuning baselines. We experiment with two datasets, CoLA and RuCoLA, in English and Russian, two typologically different languages. On top of that, we propose several black-box introspection techniques aimed at detecting changes in the attention mode of the LMs during fine-tuning, estimating the LMs' prediction confidence, and associating individual heads with fine-grained grammar phenomena. Our results contribute to understanding the behavior of monolingual LMs in the acceptability classification task, provide insights into the functional roles of attention heads, and highlight the advantages of TDA-based approaches for analyzing LMs. We release the code and the experimental results for further use.
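To make the two proposed graph features concrete, the sketch below shows one way they could be computed with networkx, assuming the attention matrix is first thresholded into an unweighted attention graph (the threshold value, the conversion to an undirected graph, and the helper names are illustrative assumptions, not the released implementation).

```python
# Minimal sketch: chordality and matching number of a thresholded attention graph.
# This is an illustrative reconstruction, not the authors' released code.
import numpy as np
import networkx as nx


def attention_graph(attn: np.ndarray, threshold: float = 0.1) -> nx.DiGraph:
    """Build a directed attention graph: edge i -> j iff attn[i, j] >= threshold."""
    n = attn.shape[0]
    g = nx.DiGraph()
    g.add_nodes_from(range(n))
    rows, cols = np.where(attn >= threshold)
    g.add_edges_from(zip(rows.tolist(), cols.tolist()))
    return g


def chordality_and_matching_number(g: nx.DiGraph) -> tuple[int, int]:
    """Compute the two features on the undirected version of the graph.

    Chordality and matchings are defined for undirected graphs, so edge
    directions and self-loops (diagonal attention) are dropped first.
    """
    u = g.to_undirected()
    u.remove_edges_from(list(nx.selfloop_edges(u)))
    is_chordal = int(nx.is_chordal(u))  # 1 if every cycle longer than 3 has a chord
    matching = nx.max_weight_matching(u, maxcardinality=True)
    return is_chordal, len(matching)    # matching number = size of a maximum matching


if __name__ == "__main__":
    # Toy attention matrix for a 4-token sentence (rows sum to 1).
    attn = np.array([
        [0.70, 0.10, 0.10, 0.10],
        [0.25, 0.50, 0.15, 0.10],
        [0.05, 0.30, 0.60, 0.05],
        [0.10, 0.10, 0.30, 0.50],
    ])
    g = attention_graph(attn, threshold=0.15)
    print(chordality_and_matching_number(g))
```

In the TDA pipeline described above, such per-head feature values would be concatenated across heads and layers and passed to a linear classifier.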