The bootstrap is a popular data-driven method for quantifying statistical uncertainty, but in modern high-dimensional problems it can incur substantial computational cost because resamples must be generated and models refit repeatedly. We study the bootstrap in high-dimensional settings with a small number of resamples. In particular, we show that by exploiting sample-resample independence from a recent "cheap" bootstrap perspective, as few as one resample can attain valid coverage even when the dimension grows at a rate close to that of the sample size, thus supporting the implementability of the bootstrap for large-scale problems. We validate our theoretical results and compare the performance of our approach against other benchmarks in a range of experiments.
Title: Bootstrap in High Dimension with Low Computation
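To make the one-resample idea concrete, the following is a minimal sketch of a cheap-bootstrap-style confidence interval for a scalar estimator. The abstract does not state the interval formula. The construction used here (point estimate plus or minus a Student's t quantile with as many degrees of freedom as resamples, times the root mean squared deviation of the resample estimates from the original estimate) is our reading of the cheap bootstrap approach and should be treated as an assumption. The function names `t_quantile_one_df` and `cheap_bootstrap_ci` are hypothetical, and the example is univariate rather than high-dimensional.

```python
import math
import random
import statistics


def t_quantile_one_df(p):
    """Quantile of Student's t with 1 degree of freedom (the Cauchy law)."""
    return math.tan(math.pi * (p - 0.5))


def cheap_bootstrap_ci(data, estimator, t_quantile, n_resamples=1, seed=0):
    """Assumed interval form: estimate +/- t_quantile * S.

    S^2 is the mean squared deviation of the resample estimates from the
    original estimate. t_quantile should be the 1 - alpha/2 quantile of
    Student's t with n_resamples degrees of freedom.
    """
    rng = random.Random(seed)
    n = len(data)
    estimate = estimator(data)
    sq_devs = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=n)  # n draws with replacement
        sq_devs.append((estimator(resample) - estimate) ** 2)
    s = math.sqrt(sum(sq_devs) / n_resamples)
    half_width = t_quantile * s
    return estimate - half_width, estimate + half_width


# Illustrative data: 500 standard normal draws, estimator is the sample mean.
data_rng = random.Random(1)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(500)]

q = t_quantile_one_df(0.975)  # roughly 12.706 for a 95% interval
lower, upper = cheap_bootstrap_ci(sample, statistics.fmean, q, n_resamples=1)
print(f"95% interval from one resample: ({lower:.3f}, {upper:.3f})")
```

With a single resample the t quantile is large, so the interval is wide. Under the same assumed form, more resamples would use a t quantile with correspondingly more degrees of freedom (for instance from `scipy.stats.t.ppf`), which shrinks the multiplier at the cost of extra model refits.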