Density-based interpretable hypercube region partitioning for mixed numeric and categorical data - Zhuanzhi Papers

Categorical data · Partitioning · Feature space · Sparsity · Machine learning modeling

October 11, 2021

Density-based interpretable hypercube region partitioning for mixed numeric and categorical data

Samuel Ackerman, Eitan Farchi, Orna Raz, Marcel Zalmanovici, Maya Zohar

Consider a structured dataset of features, such as $\{\textrm{SEX}, \textrm{INCOME}, \textrm{RACE}, \textrm{EXPERIENCE}\}$. A user may want to know where in the feature space observations are concentrated, and where they are sparse or absent. The existence of large sparse or empty regions can provide domain knowledge of soft or hard feature constraints (e.g., what the typical income range is, or that a high income is unlikely with few years of work experience). Such regions can also suggest to the user that machine learning (ML) model predictions for inputs falling in them may be unreliable. An interpretable region is a hyper-rectangle, such as $\{\textrm{RACE} \in \{\textrm{Black}, \textrm{White}\}\} \:\&\: \{10 \leq \textrm{EXPERIENCE} \leq 13\}$, containing all observations satisfying the constraints; typically, such regions are defined by a small number of features. Our method constructs an observation density-based partition of the observed feature space in the dataset into such regions. It has a number of advantages over other methods in that it works on features of mixed type (numeric or categorical) in the original domain, and can separate out empty regions as well. As can be seen from visualizations, the resulting partitions accord with spatial groupings that a human eye might identify; the results should thus extend to higher dimensions. We also show some applications of the partition to other data analysis tasks, such as inference about ML model error, measuring high-dimensional density variability, and causal inference for treatment effect. Many of these applications are made possible by the hyper-rectangular form of the partition regions.
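
To make the notion of an interpretable region concrete, the following is a minimal sketch of how a hyper-rectangle over mixed features could be represented and how the share of observations falling inside it could be computed. It is not the authors' partitioning algorithm, which the abstract does not spell out; the `Region` class, the helper `fraction_in_region`, and the toy rows are all hypothetical and serve only to illustrate the region definition from the abstract.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple, Union

# A per-feature constraint is either a set of allowed categories
# or a closed numeric interval (low, high).
Constraint = Union[FrozenSet[str], Tuple[float, float]]


@dataclass
class Region:
    """Hypothetical hyper-rectangle: a conjunction of per-feature constraints.

    Features not listed in `constraints` are left unconstrained.
    """
    constraints: Dict[str, Constraint]

    def contains(self, row: dict) -> bool:
        for feature, constraint in self.constraints.items():
            value = row[feature]
            if isinstance(constraint, frozenset):
                # Categorical feature: the value must be one of the allowed levels.
                if value not in constraint:
                    return False
            else:
                # Numeric feature: the value must lie in the closed interval.
                low, high = constraint
                if not (low <= value <= high):
                    return False
        return True


def fraction_in_region(rows: List[dict], region: Region) -> float:
    """Share of observations that satisfy every constraint of the region."""
    if not rows:
        return 0.0
    return sum(region.contains(r) for r in rows) / len(rows)


# Toy observations, invented purely for illustration.
rows = [
    {"SEX": "F", "INCOME": 52000, "RACE": "Black", "EXPERIENCE": 11},
    {"SEX": "M", "INCOME": 61000, "RACE": "White", "EXPERIENCE": 13},
    {"SEX": "F", "INCOME": 48000, "RACE": "Asian", "EXPERIENCE": 12},
    {"SEX": "M", "INCOME": 30000, "RACE": "White", "EXPERIENCE": 2},
    {"SEX": "F", "INCOME": 75000, "RACE": "Black", "EXPERIENCE": 20},
]

# The example region from the abstract:
# {RACE in {Black, White}} & {10 <= EXPERIENCE <= 13}
region = Region({
    "RACE": frozenset({"Black", "White"}),
    "EXPERIENCE": (10, 13),
})

share = fraction_in_region(rows, region)
print(f"{share:.0%} of observations fall in the region")
```

A region whose share is small relative to its volume would count as sparse in this simplified view. How the actual method chooses region boundaries and density thresholds is left to the paper itself.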
