We consider outlier-robust, sparse estimation of linear regression coefficients when the covariate vectors and the noise are sampled from an $\mathfrak{L}$-subGaussian distribution and a heavy-tailed distribution, respectively, and both are additionally contaminated by adversarial outliers. We treat two cases: the covariance matrix of the covariates is either known or unknown. In the known case in particular, our estimator attains a nearly information-theoretically optimal error bound, and our error bound is sharper than those of earlier studies dealing with similar settings. Our analysis of the estimator relies heavily on generic chaining to derive sharp error bounds.