Censored quantile regression (CQR) has become a valuable tool to study the heterogeneous association between a possibly censored outcome and a set of covariates, yet computation and statistical inference for CQR have remained a challenge for large-scale data with many covariates. In this paper, we focus on a smoothed martingale-based sequential estimating equations approach, to which scalable gradient-based algorithms can be applied. Theoretically, we provide a unified analysis of the smoothed sequential estimator and its penalized counterpart in increasing dimensions. When the covariate dimension grows with the sample size at a sublinear rate, we establish the uniform convergence rate (over a range of quantile indexes) and provide a rigorous justification for the validity of a multiplier bootstrap procedure for inference. In high-dimensional sparse settings, our results considerably improve the existing work on CQR by relaxing an exponential term of sparsity. We also demonstrate the advantage of the smoothed CQR over existing methods with both simulated experiments and data applications.
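To illustrate why smoothing makes gradient-based algorithms applicable, the following is a minimal sketch for the simpler uncensored case only. It is not the martingale-based sequential estimator studied in the paper, and it does not handle censoring. It assumes a Gaussian kernel, a fixed bandwidth `h`, and plain gradient descent with a constant step size; the function names (`normal_cdf`, `smoothed_score`, `fit_smoothed_qr`, `simulate`) and the default tuning values are hypothetical choices made for this example. Smoothing replaces the indicator 1{y - x'β ≤ 0} in the quantile score with a kernel distribution function evaluated at -(y - x'β)/h, so that the estimating function becomes differentiable in β.

```python
import math
import random


def normal_cdf(u):
    """Standard normal CDF, used here as the integrated Gaussian kernel."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def smoothed_score(beta, X, y, tau, h):
    """Sample average of the smoothed quantile score x * (Kbar(-r/h) - tau).

    This is the gradient in beta of the kernel-smoothed check loss,
    where r = y - x'beta and Kbar is the Gaussian CDF (an assumption).
    """
    p = len(beta)
    grad = [0.0] * p
    for xi, yi in zip(X, y):
        r = yi - sum(b * x for b, x in zip(beta, xi))
        # Smooth surrogate for the non-differentiable term 1{r <= 0} - tau.
        w = normal_cdf(-r / h) - tau
        for j in range(p):
            grad[j] += w * xi[j]
    n = len(y)
    return [g / n for g in grad]


def fit_smoothed_qr(X, y, tau, h=0.5, step=1.0, n_iter=500):
    """Solve the smoothed estimating equation by plain gradient descent.

    The bandwidth, step size, and iteration count are illustrative
    defaults, not values recommended by the paper.
    """
    beta = [0.0] * len(X[0])
    for _ in range(n_iter):
        g = smoothed_score(beta, X, y, tau, h)
        beta = [b - step * gj for b, gj in zip(beta, g)]
    return beta


def simulate(n, seed=0):
    """Toy uncensored data: y = 1 + 2 x + standard normal noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        X.append([1.0, x])  # first column is the intercept
        y.append(1.0 + 2.0 * x + rng.gauss(0.0, 1.0))
    return X, y
```

Calling `fit_smoothed_qr(X, y, tau=0.5)` on data from `simulate(400)` gives an intercept and slope estimate for the conditional median. In this sketch a larger `h` yields a smoother objective at the cost of more smoothing bias away from the median, which is the kind of trade-off a bandwidth choice has to balance.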