Flexible non-parametric tests of sample exchangeability and feature independence - Zhuanzhi Papers

June 23, 2022

Alan J. Aw, Jeffrey P. Spence, Yun S. Song

From arXiv. Main text: 25 pages; supplementary material: 39 pages.

In scientific studies involving analyses of multivariate data, two questions often arise for the researcher. First, is the sample exchangeable, meaning that the joint distribution of the sample is invariant to the ordering of the units? Second, are the features independent of one another, or can the features be grouped so that the groups are mutually independent? We propose a non-parametric approach that addresses these two questions. Our approach is conceptually simple, yet fast and flexible. It controls the Type I error across realistic scenarios, and handles data of arbitrary dimensions by leveraging large-sample asymptotics. In the exchangeability detection setting, through extensive simulations and a comparison against unsupervised tests of stratification based on random matrix theory, we find that our approach compares favorably in various scenarios of interest. We apply our method to problems in population and statistical genetics, including stratification detection and linkage disequilibrium splitting. We also consider other application domains, applying our approach to post-clustering single-cell chromatin accessibility data and World Values Survey data, where we show how users can partition features into independent groups, which helps generate new scientific hypotheses about the features.
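To make the first question concrete, the following is a minimal sketch of a generic permutation test for sample exchangeability. It is an illustration only and is not claimed to be the authors' procedure. It assumes that the features (columns) are mutually independent, so that under the null hypothesis shuffling each column separately leaves the joint distribution of the data matrix unchanged. The statistic (variance of pairwise squared Euclidean distances between units) and the function names `pairwise_distance_variance` and `exchangeability_pvalue` are choices made for this example.

```python
import numpy as np


def pairwise_distance_variance(X):
    """Variance of squared Euclidean distances over all pairs of rows of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    # ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 <x_i, x_j>
    dist_sq = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    upper = np.triu_indices(X.shape[0], k=1)  # each unordered pair once
    return float(np.var(dist_sq[upper]))


def exchangeability_pvalue(X, n_perm=500, seed=0):
    """Permutation p-value for the null that the rows of X are exchangeable.

    Assumption (illustrative): columns are mutually independent, so
    permuting every column independently preserves the null distribution.
    """
    rng = np.random.default_rng(seed)
    observed = pairwise_distance_variance(X)
    exceed = 0
    for _ in range(n_perm):
        # Shuffle each feature across units, independently of the others.
        X_perm = np.column_stack(
            [rng.permutation(X[:, j]) for j in range(X.shape[1])]
        )
        if pairwise_distance_variance(X_perm) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

A stratified sample, for example two groups of units whose feature means differ, tends to show a wider spread of pairwise distances than its column-shuffled copies, so the returned p-value would be small. If the features are dependent, the independent column shuffles no longer reflect the null and the test as sketched here is not valid.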

