We consider outlier-robust and sparse estimation of linear regression coefficients when the covariates and the noise are sampled from an $\mathfrak{L}$-subGaussian distribution and a heavy-tailed distribution, respectively, and both the covariates and the noise are additionally contaminated by adversarial outliers. We treat two cases: the covariance of the covariates is either known or unknown. In particular, in the former case our estimator attains a nearly information-theoretically optimal error bound, which is sharper than those of earlier studies dealing with similar situations. The analysis of our estimator relies heavily on generic chaining to derive sharp error bounds.