用于示范解释的忠实度量 (On the Faithfulness Measurements for Model Interpretations) - 专知论文

会员服务 ·

0

Performer · MoDELS · Conformer · Processing（编程语言） · surge ·

2021 年 4 月 18 日

On the Faithfulness Measurements for Model Interpretations

翻译：用于示范解释的忠实度量

Fan Yin,Zhouxing Shi,Cho-Jui Hsieh,Kai-Wei Chang

Recent years have witnessed the emergence of a variety of post-hoc interpretations that aim to uncover how natural language processing (NLP) models make predictions. Despite the surge of new interpretations, it remains an open problem how to define and quantitatively measure the faithfulness of interpretations, i.e., to what extent they conform to the reasoning process behind the model. To tackle these issues, we start with three criteria: the removal-based criterion, the sensitivity of interpretations, and the stability of interpretations, that quantify different notions of faithfulness, and propose novel paradigms to systematically evaluate interpretations in NLP. Our results show that the performance of interpretations under different criteria of faithfulness could vary substantially. Motivated by the desideratum of these faithfulness notions, we introduce a new class of interpretation methods that adopt techniques from the adversarial robustness domain. Empirical results show that our proposed methods achieve top performance under all three criteria. Along with experiments and analysis on both the text classification and the dependency parsing tasks, we come to a more comprehensive understanding of the diverse set of interpretations.

翻译：近些年来,出现了各种事后解释,目的是揭示自然语言处理模式是如何作出预测的。尽管新解释激增,但如何界定和定量衡量解释的忠实性仍然是一个未解决的问题,即这些解释在多大程度上符合模型背后的推理过程。为了解决这些问题,我们从三个标准着手:基于删除的标准、解释的敏感性和解释的稳定性,用数量表示不同的忠诚概念,并提出新的范式,以便系统地评价国家语言处理模式的解释。我们的结果表明,在不同忠诚标准下的解释性的表现可能有很大差异。我们受这些忠诚概念的贬低的驱使,我们采用了一种新的解释方法,采用从对抗性强势领域得出的技术。经验性结果显示,我们提出的方法在所有三个标准下都取得了顶尖的成绩。除了对文本分类和依赖性区分任务进行试验和分析外,我们还更全面地了解了多种解释。

0

相关内容

Performer

近期必读的六篇AAAI 2021【因果推理】相关论文和代码

专知会员服务

73+阅读 · 2021年1月12日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

【CHI2020-微软】解释可解释性:理解数据科学家使用机器学习的可解释性工具，Interpreting Interpretability: Understanding Data Scientists’Use of Interpretability Tools for Machine Learning

【CHI2020-微软】解释可解释性:理解数据科学家使用机器学习的可解释性工具，Interpreting Interpretability: Understanding Data Scientists’Use of Interpretability Tools for Machine Learning

专知会员服务

55+阅读 · 2020年3月8日

【干货】用BRET进行多标签文本分类（附代码）

【干货】用BRET进行多标签文本分类（附代码）

专知会员服务

85+阅读 · 2019年12月27日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

196+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

灾难性遗忘问题新视角：迁移-干扰平衡

灾难性遗忘问题新视角：迁移-干扰平衡

CreateAMind

17+阅读 · 2019年7月6日

可解释AI(XAI)工具集—DrWhy

可解释AI(XAI)工具集—DrWhy

专知

25+阅读 · 2019年6月4日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

On Sample Based Explanation Methods for NLP:Efficiency, Faithfulness, and Semantic Evaluation

Arxiv

0+阅读 · 2021年6月9日

On the Lack of Robust Interpretability of Neural Text Classifiers

Arxiv

0+阅读 · 2021年6月8日

Is Sparse Attention more Interpretable?

Arxiv

0+阅读 · 2021年6月8日

A Formal Causal Interpretation of the Case-Crossover Design

Arxiv

0+阅读 · 2021年6月7日

Towards Rigorous Interpretations: a Formalisation of Feature Attribution

Arxiv

4+阅读 · 2021年4月26日

The Causal Learning of Retail Delinquency

Arxiv

14+阅读 · 2020年12月17日

A Survey on Explainable Artificial Intelligence (XAI): Towards Medical XAI

A Survey on Explainable Artificial Intelligence (XAI): Towards Medical XAI

Arxiv

5+阅读 · 2019年10月15日

Interpretable machine learning: definitions, methods, and applications

Interpretable machine learning: definitions, methods, and applications

Arxiv

19+阅读 · 2019年1月14日

Interpretable Active Learning

Interpretable Active Learning

Arxiv

3+阅读 · 2018年6月24日

Discrete Autoencoders for Sequence Models

Arxiv

6+阅读 · 2018年1月29日

VIP会员

文章信息

相关主题

Processing（编程语言）

相关VIP内容

近期必读的六篇AAAI 2021【因果推理】相关论文和代码

专知会员服务

73+阅读 · 2021年1月12日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

【CHI2020-微软】解释可解释性:理解数据科学家使用机器学习的可解释性工具，Interpreting Interpretability: Understanding Data Scientists’Use of Interpretability Tools for Machine Learning

【CHI2020-微软】解释可解释性:理解数据科学家使用机器学习的可解释性工具，Interpreting Interpretability: Understanding Data Scientists’Use of Interpretability Tools for Machine Learning

专知会员服务

55+阅读 · 2020年3月8日

【干货】用BRET进行多标签文本分类（附代码）

【干货】用BRET进行多标签文本分类（附代码）

专知会员服务

85+阅读 · 2019年12月27日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【新书】Python编程基础，669页pdf

【新书】Python编程基础，669页pdf

专知会员服务

196+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

从社会学实验到行为仿真：理解基于Agent的观点动力学建模思维

中英文版《GPT-5 System Card速览》报告

ACL 2025 | 大模型结构化知识提示的泛化能力研究

【普林斯顿博士论文】大型模型的高效推理

相关资讯

灾难性遗忘问题新视角：迁移-干扰平衡

灾难性遗忘问题新视角：迁移-干扰平衡

CreateAMind

17+阅读 · 2019年7月6日

可解释AI(XAI)工具集—DrWhy

可解释AI(XAI)工具集—DrWhy

专知

25+阅读 · 2019年6月4日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

相关论文

On Sample Based Explanation Methods for NLP:Efficiency, Faithfulness, and Semantic Evaluation

Arxiv

0+阅读 · 2021年6月9日

On the Lack of Robust Interpretability of Neural Text Classifiers

Arxiv

0+阅读 · 2021年6月8日

Is Sparse Attention more Interpretable?

Arxiv

0+阅读 · 2021年6月8日

A Formal Causal Interpretation of the Case-Crossover Design

Arxiv

0+阅读 · 2021年6月7日

Towards Rigorous Interpretations: a Formalisation of Feature Attribution

Arxiv

4+阅读 · 2021年4月26日

The Causal Learning of Retail Delinquency

Arxiv

14+阅读 · 2020年12月17日

A Survey on Explainable Artificial Intelligence (XAI): Towards Medical XAI

A Survey on Explainable Artificial Intelligence (XAI): Towards Medical XAI

Arxiv

5+阅读 · 2019年10月15日

Interpretable machine learning: definitions, methods, and applications

Interpretable machine learning: definitions, methods, and applications

Arxiv

19+阅读 · 2019年1月14日

Interpretable Active Learning

Interpretable Active Learning

Arxiv

3+阅读 · 2018年6月24日

Discrete Autoencoders for Sequence Models

Arxiv

6+阅读 · 2018年1月29日

微信扫码咨询专知VIP会员