February 2, 2021

Kernel Hypothesis Testing with Set-valued Data


Alexis Bellot,Mihaela van der Schaar

We present a general framework for hypothesis testing on distributions of sets of individual examples. Sets may represent many common data sources, such as groups of observations in time series, collections of words in text, or a batch of images of a given phenomenon. This observation pattern, however, departs from the usual assumptions of hypothesis testing: each set differs in size, may have a different level of noise, and may incorporate nuisance variability that is irrelevant to the phenomenon of interest, all of which bias test decisions if not accounted for. In this paper, we propose to interpret sets as independent samples from a collection of latent probability distributions, and introduce kernel two-sample and independence tests in this latent space of distributions. We prove the consistency of the tests and observe that they outperform alternatives in a wide range of synthetic experiments. Finally, we showcase their use in practice with experiments on healthcare and climate data, where heuristics were previously needed for feature extraction and testing.
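To make the idea of a kernel two-sample test on set-valued data concrete, the following is a minimal sketch and not the authors' procedure, since the abstract gives no implementation details. It assumes each set is summarized by its empirical kernel mean embedding, so the kernel between two sets is the average Gaussian kernel over all pairs of their elements, and it calibrates a biased MMD statistic by permutation. All function names, the `gamma` bandwidth, and the synthetic data are hypothetical choices for illustration; in particular, this sketch does not correct for the differing set sizes and noise levels that the abstract identifies as sources of bias.

```python
import numpy as np


def rbf(a, b, gamma=0.5):
    # Pairwise Gaussian kernel between the rows of a (n, d) and b (m, d).
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def set_kernel(s, t, gamma=0.5):
    # Inner product of the empirical mean embeddings of two sets:
    # the average of k(x, y) over all x in s and all y in t.
    return rbf(s, t, gamma).mean()


def gram(sets, gamma=0.5):
    # Gram matrix of the set kernel over a list of sets of varying size.
    n = len(sets)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = set_kernel(sets[i], sets[j], gamma)
    return K


def mmd2(K, is_first):
    # Biased (V-statistic) estimate of squared MMD from a pooled Gram matrix;
    # is_first is a boolean mask marking the sets of the first sample.
    a, b = is_first, ~is_first
    return (K[np.ix_(a, a)].mean() + K[np.ix_(b, b)].mean()
            - 2.0 * K[np.ix_(a, b)].mean())


def permutation_test(sets_p, sets_q, n_perm=200, gamma=0.5, seed=0):
    # Two-sample test: shuffle the group labels of whole sets to build the null.
    rng = np.random.default_rng(seed)
    K = gram(list(sets_p) + list(sets_q), gamma)
    labels = np.arange(K.shape[0]) < len(sets_p)
    observed = mmd2(K, labels)
    null = np.array([mmd2(K, rng.permutation(labels)) for _ in range(n_perm)])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value


def make_sets(rng, n_sets, shift):
    # Synthetic sets of random size, each drawn around its own latent mean.
    return [rng.normal(loc=shift + 0.1 * rng.normal(), scale=1.0,
                       size=(int(rng.integers(5, 20)), 2))
            for _ in range(n_sets)]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    stat, p = permutation_test(make_sets(rng, 20, 0.0), make_sets(rng, 20, 2.0))
    print(f"MMD^2 = {stat:.4f}, permutation p-value = {p:.4f}")
```

Permuting whole sets rather than individual observations keeps each set intact, which matches the abstract's view of a set as a single draw from a latent distribution.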

