Computationally efficient permutation tests for the multivariate two-sample problem based on energy distance or maximum mean discrepancy statistics - 专知论文

会员服务 ·

0

统计量 · 最大平均偏差 · 欧几里得距离 · 计算成本 · 均值 ·

Computationally efficient permutation tests for the multivariate two-sample problem based on energy distance or maximum mean discrepancy statistics

翻译：暂无翻译

Elias Chaibub Neto

from arxiv, 22 pages, 3 figures

Non-parametric two-sample tests based on energy distance or maximum mean discrepancy are widely used statistical tests for comparing multivariate data from two populations. While these tests enjoy desirable statistical properties, their test statistics can be expensive to compute as they require the computation of 3 distinct Euclidean distance (or kernel) matrices between samples, where the time complexity of each of these computations (namely, $O(n_{x}^2 p)$, $O(n_{y}^2 p)$, and $O(n_{x} n_{y} p)$) scales quadratically with the number of samples ($n_x$, $n_y$) and linearly with the number of variables ($p$). Since the standard permutation test requires repeated re-computations of these expensive statistics it's application to large datasets can become unfeasible. While several statistical approaches have been proposed to mitigate this issue, they all sacrifice desirable statistical properties to decrease the computational cost (e.g., trade computation speed by a decrease in statistical power). A better computational strategy is to first pre-compute the Euclidean distance (kernel) matrix of the concatenated data, and then permute indexes and retrieve the corresponding elements to compute the re-sampled statistics. While this strategy can reduce the computation cost relative to the standard permutation test, it relies on the computation of a larger Euclidean distance (kernel) matrix with complexity $O((n_x + n_y)^2 p)$. In this paper, we present a novel computationally efficient permutation algorithm which only requires the pre-computation of the 3 smaller matrices and achieves large computational speedups without sacrificing finite-sample validity or statistical power. We illustrate its computational gains in a series of experiments and compare its statistical power to the current state-of-the-art approach for balancing computational cost and statistical performance.

翻译：暂无翻译

0

相关内容

统计量

分布外泛化(Out-Of-Distribution Generalization) 综述论文，22页pdf240篇文献

专知会员服务

63+阅读 · 2021年9月2日

【ACL2020】多模态信息抽取，365页ppt

【ACL2020】多模态信息抽取，365页ppt

专知会员服务

141+阅读 · 2020年7月6日

【TPAMI2020】目标检测中的不平衡问题:综述论文，34页pdf

专知会员服务

53+阅读 · 2020年3月16日

【AI应用】Facebook-利用神经网络求解高等数学方程, Using neural networks to solve advanced mathematics equations

【AI应用】Facebook-利用神经网络求解高等数学方程, Using neural networks to solve advanced mathematics equations

专知会员服务

33+阅读 · 2020年1月15日

【WSDM2020】超越统计关系：将知识关系整合到多标签音乐风格分类的风格关联中（附pdf）

专知会员服务

16+阅读 · 2019年11月23日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

25+阅读 · 2019年10月18日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

32+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

145+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

171+阅读 · 2019年10月11日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

39+阅读 · 2019年10月9日

RL解决'BipedalWalkerHardcore-v2' (SOTA)

RL解决'BipedalWalkerHardcore-v2' (SOTA)

CreateAMind

31+阅读 · 2019年7月17日

Github项目推荐 | 股市预测的机器学习/深度学习模型/资源集锦

Github项目推荐 | 股市预测的机器学习/深度学习模型/资源集锦

AI研习社

32+阅读 · 2019年4月18日

Github项目推荐 | 语义分割、实例分割、全景分割和视频分割的论文和基准列表

Github项目推荐 | 语义分割、实例分割、全景分割和视频分割的论文和基准列表

AI研习社

31+阅读 · 2019年4月5日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

17+阅读 · 2019年1月7日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

16+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

论文浅尝 | 嵌入常识知识的注意力 LSTM 模型用于特定目标的基于侧面的情感分析

论文浅尝 | 嵌入常识知识的注意力 LSTM 模型用于特定目标的基于侧面的情感分析

开放知识图谱

28+阅读 · 2018年6月11日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

11+阅读 · 2018年3月15日

基于位置注意力机制模型和带标签数据来提升槽填充（EMNLP outstanding paper）

基于位置注意力机制模型和带标签数据来提升槽填充（EMNLP outstanding paper）

科技创新与创业

17+阅读 · 2017年11月17日

基于Amalgam空间的Hardy空间实变理论及其应用

国家自然科学基金

1+阅读 · 2017年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

1+阅读 · 2015年12月31日

三类多尺度问题的多尺度算法

国家自然科学基金

1+阅读 · 2015年12月31日

受Mittag-Lef？er噪声激励的广义朗之万方程的随机共振研究

国家自然科学基金

0+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

随机二阶锥互补问题理论与算法研究及其应用

国家自然科学基金

0+阅读 · 2015年12月31日

基于决策模型和预备电位的运动想象BCI研究

国家自然科学基金

1+阅读 · 2015年12月31日

哈密尔顿系统及多体问题的周期解

国家自然科学基金

0+阅读 · 2014年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

概率抽样设计及其统计推断方法

国家自然科学基金

4+阅读 · 2014年12月31日

A moment approach for entropy solutions of parameter-dependent hyperbolic conservation laws

Arxiv

0+阅读 · 6月24日

Weak recovery, hypothesis testing, and mutual information in stochastic block models and planted factor graphs

Arxiv

0+阅读 · 6月22日

Conditional score-based diffusion models for solving inverse problems in mechanics

Arxiv

0+阅读 · 6月21日

Sharp detection of low-dimensional structure in probability measures via dimensional logarithmic Sobolev inequalities

Arxiv

0+阅读 · 6月21日

The neural correlates of logical-mathematical symbol systems processing resemble that of spatial cognition more than natural language processing

Arxiv

0+阅读 · 6月20日

On one-step numerical schemes of weak convergence for SDEs with super-linear coefficients

Arxiv

0+阅读 · 6月20日

Neural Bayes estimators for censored inference with peaks-over-threshold models

Arxiv

0+阅读 · 6月18日

A posteriori error estimation for an interior penalty virtual element method of Kirchhoff plates

Arxiv

0+阅读 · 6月17日

Stein's method of moments for truncated multivariate distributions

Arxiv

0+阅读 · 6月15日

Polytopal mesh agglomeration via geometrical deep learning for three-dimensional heterogeneous domains

Arxiv

0+阅读 · 6月15日

VIP会员

文章信息

相关主题

最大平均偏差

欧几里得距离

相关VIP内容

分布外泛化(Out-Of-Distribution Generalization) 综述论文，22页pdf240篇文献

专知会员服务

63+阅读 · 2021年9月2日

【ACL2020】多模态信息抽取，365页ppt

【ACL2020】多模态信息抽取，365页ppt

专知会员服务

141+阅读 · 2020年7月6日

【TPAMI2020】目标检测中的不平衡问题:综述论文，34页pdf

专知会员服务

53+阅读 · 2020年3月16日

【AI应用】Facebook-利用神经网络求解高等数学方程, Using neural networks to solve advanced mathematics equations

【AI应用】Facebook-利用神经网络求解高等数学方程, Using neural networks to solve advanced mathematics equations

专知会员服务

33+阅读 · 2020年1月15日

【WSDM2020】超越统计关系：将知识关系整合到多标签音乐风格分类的风格关联中（附pdf）

专知会员服务

16+阅读 · 2019年11月23日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

25+阅读 · 2019年10月18日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

32+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

145+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

171+阅读 · 2019年10月11日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

39+阅读 · 2019年10月9日

热门VIP内容

相关资讯

RL解决'BipedalWalkerHardcore-v2' (SOTA)

RL解决'BipedalWalkerHardcore-v2' (SOTA)

CreateAMind

31+阅读 · 2019年7月17日

Github项目推荐 | 股市预测的机器学习/深度学习模型/资源集锦

Github项目推荐 | 股市预测的机器学习/深度学习模型/资源集锦

AI研习社

32+阅读 · 2019年4月18日

Github项目推荐 | 语义分割、实例分割、全景分割和视频分割的论文和基准列表

Github项目推荐 | 语义分割、实例分割、全景分割和视频分割的论文和基准列表

AI研习社

31+阅读 · 2019年4月5日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

17+阅读 · 2019年1月7日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

16+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

论文浅尝 | 嵌入常识知识的注意力 LSTM 模型用于特定目标的基于侧面的情感分析

论文浅尝 | 嵌入常识知识的注意力 LSTM 模型用于特定目标的基于侧面的情感分析

开放知识图谱

28+阅读 · 2018年6月11日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

11+阅读 · 2018年3月15日

基于位置注意力机制模型和带标签数据来提升槽填充（EMNLP outstanding paper）

基于位置注意力机制模型和带标签数据来提升槽填充（EMNLP outstanding paper）

科技创新与创业

17+阅读 · 2017年11月17日

相关论文

A moment approach for entropy solutions of parameter-dependent hyperbolic conservation laws

Arxiv

0+阅读 · 6月24日

Weak recovery, hypothesis testing, and mutual information in stochastic block models and planted factor graphs

Arxiv

0+阅读 · 6月22日

Conditional score-based diffusion models for solving inverse problems in mechanics

Arxiv

0+阅读 · 6月21日

Sharp detection of low-dimensional structure in probability measures via dimensional logarithmic Sobolev inequalities

Arxiv

0+阅读 · 6月21日

The neural correlates of logical-mathematical symbol systems processing resemble that of spatial cognition more than natural language processing

Arxiv

0+阅读 · 6月20日

On one-step numerical schemes of weak convergence for SDEs with super-linear coefficients

Arxiv

0+阅读 · 6月20日

Neural Bayes estimators for censored inference with peaks-over-threshold models

Arxiv

0+阅读 · 6月18日

A posteriori error estimation for an interior penalty virtual element method of Kirchhoff plates

Arxiv

0+阅读 · 6月17日

Stein's method of moments for truncated multivariate distributions

Arxiv

0+阅读 · 6月15日

Polytopal mesh agglomeration via geometrical deep learning for three-dimensional heterogeneous domains

Arxiv

0+阅读 · 6月15日

相关基金

基于Amalgam空间的Hardy空间实变理论及其应用

国家自然科学基金

1+阅读 · 2017年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

1+阅读 · 2015年12月31日

三类多尺度问题的多尺度算法

国家自然科学基金

1+阅读 · 2015年12月31日

受Mittag-Lef？er噪声激励的广义朗之万方程的随机共振研究

国家自然科学基金

0+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

随机二阶锥互补问题理论与算法研究及其应用

国家自然科学基金

0+阅读 · 2015年12月31日

基于决策模型和预备电位的运动想象BCI研究

国家自然科学基金

1+阅读 · 2015年12月31日

哈密尔顿系统及多体问题的周期解

国家自然科学基金

0+阅读 · 2014年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

概率抽样设计及其统计推断方法

国家自然科学基金

4+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员