Subsampling is an efficient way to handle massive data. In this paper, we investigate optimal subsampling for linear quantile regression with functional covariates. We first derive the asymptotic distribution of the subsampling estimator and then obtain the optimal subsampling probabilities under the A-optimality criterion. We also propose modified subsampling probabilities that do not require estimating the conditional densities of the response given the covariates and are therefore easier to implement in practice. Numerical experiments on synthetic and real data show that the proposed methods consistently outperform uniform subsampling and closely approximate the full-data results at a lower computational cost.
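To make the idea of density-free subsampling probabilities concrete, the following is a minimal sketch and not the paper's exact formulas, which the abstract does not state. It assumes that each functional covariate has already been reduced to a finite vector of basis coefficients (the rows of the hypothetical `X`), that a pilot estimate `beta_pilot` is available, and that the probabilities take the form commonly used in the quantile-regression subsampling literature, proportional to |tau - 1{residual < 0}| times the covariate norm. The function names `subsampling_probabilities` and `draw_subsample` are illustrative.

```python
import math
import random


def subsampling_probabilities(X, y, beta_pilot, tau):
    """Density-free probabilities (assumed form, not taken from the paper):
    pi_i proportional to |tau - 1{y_i - x_i' beta_pilot < 0}| * ||x_i||.
    """
    scores = []
    for x_i, y_i in zip(X, y):
        # Residual under the pilot estimate.
        residual = y_i - sum(b * v for b, v in zip(beta_pilot, x_i))
        indicator = 1.0 if residual < 0 else 0.0
        # Euclidean norm of the (basis-expanded) covariate vector.
        norm = math.sqrt(sum(v * v for v in x_i))
        scores.append(abs(tau - indicator) * norm)
    total = sum(scores)
    # Normalize so the probabilities sum to one.
    return [s / total for s in scores]


def draw_subsample(probs, r, seed=0):
    """Draw r indices with replacement and return inverse-probability weights."""
    rng = random.Random(seed)
    idx = rng.choices(range(len(probs)), weights=probs, k=r)
    # Weights 1 / (r * pi_i) keep a weighted loss unbiased for the full-data loss average up to scaling.
    weights = [1.0 / (r * probs[i]) for i in idx]
    return idx, weights


# Toy data: three observations with two basis coefficients each.
X = [[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]]
y = [1.0, -1.0, 0.0]
probs = subsampling_probabilities(X, y, beta_pilot=[0.0, 0.0], tau=0.5)
idx, weights = draw_subsample(probs, r=5)
```

The sampled indices and weights would then be passed to a weighted quantile-regression fit (a weighted check loss) on the subsample only, so the cost of the final fit depends on `r` rather than on the full sample size.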