The additive model is a popular nonparametric regression method because it retains modeling flexibility while avoiding the curse of dimensionality. The backfitting algorithm is an intuitive and widely used numerical approach for fitting additive models. However, applying it to large datasets can incur a prohibitive computational cost, making it infeasible in practice. To address this problem, we propose a novel approach called independence-encouraging subsampling (IES) to select a subsample from big data for training additive models. Inspired by the minimax optimality of an orthogonal array (OA), which follows from its pairwise independent predictors and uniform coverage of the range of each predictor, the IES approach selects a subsample that approximates an OA so as to achieve minimax optimality. Our asymptotic analyses demonstrate that an IES subsample converges to an OA and that the backfitting algorithm over the subsample converges to a unique solution even if the predictors are highly dependent in the original big data. The proposed IES method is also shown to be numerically appealing via simulations and a real data application.
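For concreteness, the backfitting procedure referred to above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the k-nearest-neighbour mean smoother and the names `smooth` and `backfit` are assumptions chosen for simplicity; any univariate smoother could take its place.

```python
def smooth(x, r, k=15):
    # Illustrative k-nearest-neighbour mean smoother: estimate E[r | x]
    # at each observed point by averaging r over the k nearest x-values.
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    fitted = [0.0] * n
    for rank, i in enumerate(order):
        lo = max(0, rank - k // 2)
        hi = min(n, lo + k)
        lo = max(0, hi - k)  # keep a full window of k points near boundaries
        window = [r[order[j]] for j in range(lo, hi)]
        fitted[i] = sum(window) / len(window)
    return fitted

def backfit(X, y, n_iter=20):
    # Backfitting for the additive model y = alpha + sum_j f_j(x_j):
    # cyclically smooth the partial residuals against each predictor.
    # X is a list of p predictor columns, each of length n.
    n, p = len(y), len(X)
    alpha = sum(y) / n
    f = [[0.0] * n for _ in range(p)]  # fitted component values f_j(x_ij)
    for _ in range(n_iter):
        for j in range(p):
            # partial residual: remove the intercept and all other components
            r = [y[i] - alpha - sum(f[k][i] for k in range(p) if k != j)
                 for i in range(n)]
            f[j] = smooth(X[j], r)
            mean_j = sum(f[j]) / n
            f[j] = [v - mean_j for v in f[j]]  # centre for identifiability
    return alpha, f
```

The convergence of this cyclic update is exactly what is fragile under strongly dependent predictors, which is why a subsample with approximately pairwise independent predictors (as IES encourages) is advantageous.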