Finite sample breakdown point of multivariate regression depth median

The maximum depth estimator (also known as the depth median), $\bs{\beta}^*_{RD}$, induced from the regression depth (RD) of Rousseeuw and Hubert (1999) (RH99), is one of the most widely used estimators in regression. Its robustness is outstanding and comparable to that of its univariate location counterpart. Indeed, $\bs{\beta}^*_{RD}$ can, asymptotically, resist up to $33\%$ contamination without breakdown, in contrast to $0\%$ for the traditional (least squares and least absolute deviations) estimators (see Van Aelst and Rousseeuw (2000) (VAR00)). The results of VAR00 are pioneering, yet they are limited to regression-symmetric populations (with a strictly positive density) and to the $\epsilon$-contamination and maximum-bias model. In practice, with a fixed finite sample size, the most widely used measure of robustness for estimators is the finite-sample breakdown point (FSBP) (Donoho and Huber (1983)). Despite many attempts in the literature, only sporadic partial results on the FSBP of $\bs{\beta}^*_{RD}$ have been obtained, and the exact FSBP of $\bs{\beta}^*_{RD}$ has remained an open problem for more than twenty years. Furthermore, is the asymptotic breakdown value $1/3$ (the limit of an increasing sequence of finite-sample breakdown values) relevant in finite-sample practice, and how do the finite-sample and limiting breakdown values differ? Such questions have yet to be discussed in the literature. This article addresses the above issues, revealing an intrinsic connection between the regression depth of $\bs{\beta}^*_{RD}$ and the newly obtained exact FSBP. It justifies the use of $\bs{\beta}^*_{RD}$ as a robust alternative to the traditional estimators and demonstrates the necessity and merit of using the FSBP in finite-sample practice.
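
For readers who want the notion of FSBP in symbols, the following is a sketch of the replacement version of the Donoho and Huber (1983) breakdown point. The notation is an assumption made here for illustration and is not taken from the abstract: $T$ denotes a regression estimator, $Z^n$ a sample of $n$ observations, and $Z^n_m$ any sample obtained from $Z^n$ by replacing $m$ of its points with arbitrary values. The article itself may work with a different variant (for example, the addition version).

$$
\mathrm{RBP}(T, Z^n) \;=\; \min_{1 \le m \le n} \left\{ \frac{m}{n} \;:\; \sup_{Z^n_m} \left\| T(Z^n_m) - T(Z^n) \right\| = \infty \right\}
$$

Under this definition, a single suitably placed bad observation can carry the least squares or least absolute deviations fit arbitrarily far, so their replacement breakdown point is $1/n$, which tends to the asymptotic value of $0\%$ quoted above.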
