Scalable subsampling: computation, aggregation and inference - Zhuanzhi (专知) Papers

Subsampling · Estimation/Estimator · Inference · Statistic · Confidence

December 13, 2021

Scalable subsampling: computation, aggregation and inference

Dimitris N. Politis

Subsampling is a general statistical method developed in the 1990s aimed at estimating the sampling distribution of a statistic $\hat \theta _n$ in order to conduct nonparametric inference such as the construction of confidence intervals and hypothesis tests. Subsampling has seen a resurgence in the Big Data era where the standard, full-resample size bootstrap can be infeasible to compute. Nevertheless, even choosing a single random subsample of size $b$ can be computationally challenging with both $b$ and the sample size $n$ being very large. In the paper at hand, we show how a set of appropriately chosen, non-random subsamples can be used to conduct effective -- and computationally feasible -- distribution estimation via subsampling. Further, we show how the same set of subsamples can be used to yield a procedure for subsampling aggregation -- also known as subagging -- that is scalable with big data. Interestingly, the scalable subagging estimator can be tuned to have the same (or better) rate of convergence as compared to $\hat \theta _n$. The paper is concluded by showing how to conduct inference, e.g., confidence intervals, based on the scalable subagging estimator instead of the original $\hat \theta _n$.
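
The abstract does not spell out how the non-random subsamples are chosen, so the following is only a minimal sketch under stated assumptions: the subsamples are taken to be consecutive, non-overlapping blocks of size $b$, the statistic is the sample mean, and the convergence rate is assumed to be $\sqrt{n}$. All function names are hypothetical and are not taken from the paper.

```python
import math
import random


def block_subsamples(data, b):
    """Return consecutive, non-overlapping blocks of size b; a leftover tail is dropped."""
    q = len(data) // b  # number of non-random subsamples (assumed construction)
    return [data[i * b:(i + 1) * b] for i in range(q)]


def mean(xs):
    return sum(xs) / len(xs)


def subagging(data, b, statistic):
    """Average the statistic over the blocks (subsampling aggregation)."""
    values = [statistic(block) for block in block_subsamples(data, b)]
    return mean(values), values


def empirical_quantile(sorted_values, p):
    """Smallest order statistic whose empirical CDF is at least p."""
    k = max(1, math.ceil(p * len(sorted_values)))
    return sorted_values[k - 1]


def subsampling_interval(data, b, statistic, alpha=0.05):
    """Equal-tailed subsampling interval, assuming a sqrt(n) rate of convergence."""
    n = len(data)
    theta_n = statistic(data)
    _, values = subagging(data, b, statistic)
    # Subsample roots sqrt(b) * (theta_b_i - theta_n) stand in for sqrt(n) * (theta_n - theta).
    roots = sorted(math.sqrt(b) * (v - theta_n) for v in values)
    c_lo = empirical_quantile(roots, alpha / 2)
    c_hi = empirical_quantile(roots, 1 - alpha / 2)
    return theta_n - c_hi / math.sqrt(n), theta_n - c_lo / math.sqrt(n)


# Illustrative data only: 10,000 simulated Gaussian draws with a fixed seed.
rng = random.Random(0)
sample = [rng.gauss(1.0, 2.0) for _ in range(10_000)]
theta_bar, block_values = subagging(sample, b=100, statistic=mean)
lower, upper = subsampling_interval(sample, b=100, statistic=mean)
print(len(block_values), theta_bar, (lower, upper))
```

Each block is read once and no random index generation is needed, which is the computational appeal of a non-random scheme. For the sample mean with blocks that tile the data exactly, the aggregated value coincides with the full-sample mean, so this toy case only illustrates the mechanics and not the rate comparison made in the abstract.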
