Improvements in technology lead to the increasing availability of large data sets, which makes data reduction and informative subsamples ever more important. In this paper, we construct $D$-optimal subsampling designs for polynomial regression in one covariate for invariant distributions of the covariate. We study quadratic regression more closely for specific distributions. In particular, we characterize the shape of the resulting optimal subsampling designs and the effect of the subsample size on the design. To illustrate the advantage of the optimal subsampling designs, we examine the efficiency of uniform random subsampling relative to them.