使用 Shappely 值解释维度减量结果 (Explaining dimensionality reduction results using Shapley values) - 专知论文

会员服务 ·

0

降维 · 簇 · 数据分析 · CASE · 均值 ·

2021 年 3 月 9 日

Explaining dimensionality reduction results using Shapley values

翻译：使用 Shappely 值解释维度减量结果

Wilson Estécio Marcílio Júnior,Danilo Medeiros Eler

Dimensionality reduction (DR) techniques have been consistently supporting high-dimensional data analysis in various applications. Besides the patterns uncovered by these techniques, the interpretation of DR results based on each feature's contribution to the low-dimensional representation supports new finds through exploratory analysis. Current literature approaches designed to interpret DR techniques do not explain the features' contributions well since they focus only on the low-dimensional representation or do not consider the relationship among features. This paper presents ClusterShapley to address these problems, using Shapley values to generate explanations of dimensionality reduction techniques and interpret these algorithms using a cluster-oriented analysis. ClusterShapley explains the formation of clusters and the meaning of their relationship, which is useful for exploratory data analysis in various domains. We propose novel visualization techniques to guide the interpretation of features' contributions on clustering formation and validate our methodology through case studies of publicly available datasets. The results demonstrate our approach's interpretability and analysis power to generate insights about pathologies and patients in different conditions using DR results.

翻译：除了这些技术所发现的模式外,根据每个特征对低维代表度的贡献对DR结果的解释也支持通过探索性分析发现新的发现。目前旨在解释DR技术的文献方法并不能很好地解释这些特征的贡献,因为它们只侧重于低维代表度,或不考虑各种特征之间的关系。本文展示了处理这些问题的群集特征,利用沙普利值来解释减少维度技术,并利用以集群为导向的分析来解释这些算法。群集解释了集群的形成及其关系的含义,这对不同领域的探索性数据分析很有用。我们提出了新的可视化技术来指导对特征对集群形成的贡献的解释,并通过对公开提供的数据集进行个案研究来验证我们的方法。结果表明我们的方法的可解释性和分析能力,以便利用DR结果对不同条件下的病理学和病人产生洞察力。

0

相关内容

降维是将数据从高维空间转换为低维空间，以便低维表示保留原始数据的某些有意义的属性，理想情况下接近其固有维。降维在处理大量观察和/或大量变量的领域很常见，例如信号处理，语音识别，神经信息学和生物信息学。

【普林斯顿经典书】高维概率，326页pdf，Probability in High Dimension

【普林斯顿经典书】高维概率，326页pdf，Probability in High Dimension

专知会员服务

106+阅读 · 2021年2月27日

【ETH】最新《几何数据分析》2020课程，附PPT下载

专知会员服务

45+阅读 · 2020年12月18日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

专知会员服务

123+阅读 · 2020年5月30日

【斯坦福大学】面向可解释人工智能:神经网络的显著性检验（Towards Explainable AI: Significance Tests for Neural Networks），26页pdf

【斯坦福大学】面向可解释人工智能:神经网络的显著性检验（Towards Explainable AI: Significance Tests for Neural Networks），26页pdf

专知会员服务

27+阅读 · 2019年12月19日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【IJCAI 2019】细粒度的意见挖掘:当前趋势和前沿维度（Fine-grained Opinion Mining: Current Trend and Cutting-Edge Dimensions），虞剑飞

【IJCAI 2019】细粒度的意见挖掘:当前趋势和前沿维度（Fine-grained Opinion Mining: Current Trend and Cutting-Edge Dimensions），虞剑飞

专知会员服务

26+阅读 · 2019年8月11日

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

专知

21+阅读 · 2020年5月30日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【泡泡一分钟】学习紧密的几何特征（ICCV2017-17）

【泡泡一分钟】学习紧密的几何特征（ICCV2017-17）

泡泡机器人SLAM

20+阅读 · 2018年5月8日

笔记 | Sentiment Analysis

笔记 | Sentiment Analysis

黑龙江大学自然语言处理实验室

10+阅读 · 2018年5月6日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

最佳实践：深度学习用于自然语言处理（三）

最佳实践：深度学习用于自然语言处理（三）

待字闺中

3+阅读 · 2017年8月20日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Explainable Recommender Systems via Resolving Learning Representations

Arxiv

13+阅读 · 2020年8月21日

Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks

Arxiv

6+阅读 · 2020年6月15日

Generalization and Regularization in DQN

Generalization and Regularization in DQN

Arxiv

6+阅读 · 2019年1月30日

Learning From Positive and Unlabeled Data: A Survey

Learning From Positive and Unlabeled Data: A Survey

Arxiv

5+阅读 · 2018年11月12日

Super Characters: A Conversion from Sentiment Classification to Image Classification

Arxiv

4+阅读 · 2018年10月15日

Learning to Focus when Ranking Answers

Learning to Focus when Ranking Answers

Arxiv

5+阅读 · 2018年8月8日

High Performance Software in Multidimensional Reduction Methods for Image Processing with Application to Ancient Manuscripts

High Performance Software in Multidimensional Reduction Methods for Image Processing with Application to Ancient Manuscripts

Arxiv

4+阅读 · 2018年7月18日

Multimodal Sentiment Analysis To Explore the Structure of Emotions

Arxiv

19+阅读 · 2018年5月25日

An Iterative Spanning Forest Framework for Superpixel Segmentation

Arxiv

9+阅读 · 2018年1月30日

Constraint and Mathematical Programming Models for Integrated Port Container Terminal Operations

Arxiv

3+阅读 · 2017年12月14日

VIP会员

文章信息

相关主题

相关VIP内容

【普林斯顿经典书】高维概率，326页pdf，Probability in High Dimension

【普林斯顿经典书】高维概率，326页pdf，Probability in High Dimension

专知会员服务

106+阅读 · 2021年2月27日

【ETH】最新《几何数据分析》2020课程，附PPT下载

专知会员服务

45+阅读 · 2020年12月18日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

专知会员服务

123+阅读 · 2020年5月30日

【斯坦福大学】面向可解释人工智能:神经网络的显著性检验（Towards Explainable AI: Significance Tests for Neural Networks），26页pdf

【斯坦福大学】面向可解释人工智能:神经网络的显著性检验（Towards Explainable AI: Significance Tests for Neural Networks），26页pdf

专知会员服务

27+阅读 · 2019年12月19日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【IJCAI 2019】细粒度的意见挖掘:当前趋势和前沿维度（Fine-grained Opinion Mining: Current Trend and Cutting-Edge Dimensions），虞剑飞

【IJCAI 2019】细粒度的意见挖掘:当前趋势和前沿维度（Fine-grained Opinion Mining: Current Trend and Cutting-Edge Dimensions），虞剑飞

专知会员服务

26+阅读 · 2019年8月11日

热门VIP内容

开通专知VIP会员享更多权益服务

检索增强生成（RAG）技术，261页slides

美联参会指南-联合规划与执行概述及政策框架 | 32页

从DeepSeek-R1学到的三个核心经验

大规模视觉模型中的提示式适配：综述

相关资讯

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

专知

21+阅读 · 2020年5月30日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【泡泡一分钟】学习紧密的几何特征（ICCV2017-17）

【泡泡一分钟】学习紧密的几何特征（ICCV2017-17）

泡泡机器人SLAM

20+阅读 · 2018年5月8日

笔记 | Sentiment Analysis

笔记 | Sentiment Analysis

黑龙江大学自然语言处理实验室

10+阅读 · 2018年5月6日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

最佳实践：深度学习用于自然语言处理（三）

最佳实践：深度学习用于自然语言处理（三）

待字闺中

3+阅读 · 2017年8月20日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Explainable Recommender Systems via Resolving Learning Representations

Arxiv

13+阅读 · 2020年8月21日

Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks

Arxiv

6+阅读 · 2020年6月15日

Generalization and Regularization in DQN

Generalization and Regularization in DQN

Arxiv

6+阅读 · 2019年1月30日

Learning From Positive and Unlabeled Data: A Survey

Learning From Positive and Unlabeled Data: A Survey

Arxiv

5+阅读 · 2018年11月12日

Super Characters: A Conversion from Sentiment Classification to Image Classification

Arxiv

4+阅读 · 2018年10月15日

Learning to Focus when Ranking Answers

Learning to Focus when Ranking Answers

Arxiv

5+阅读 · 2018年8月8日

High Performance Software in Multidimensional Reduction Methods for Image Processing with Application to Ancient Manuscripts

High Performance Software in Multidimensional Reduction Methods for Image Processing with Application to Ancient Manuscripts

Arxiv

4+阅读 · 2018年7月18日

Multimodal Sentiment Analysis To Explore the Structure of Emotions

Arxiv

19+阅读 · 2018年5月25日

An Iterative Spanning Forest Framework for Superpixel Segmentation

Arxiv

9+阅读 · 2018年1月30日

Constraint and Mathematical Programming Models for Integrated Port Container Terminal Operations

Arxiv

3+阅读 · 2017年12月14日

微信扫码咨询专知VIP会员