Convex and penalized robust methods often suffer from bias induced by large outliers, which limits their effectiveness in adversarial or heavy-tailed settings. In this study, we propose a novel approach that eliminates this bias (when possible) by leveraging a non-convex $M$-estimator based on the $\alpha$-divergence. We address the problem of estimating the parameter vector in high-dimensional linear regression, even when a subset of the data has been deliberately corrupted by an adversary with full knowledge of the dataset and its underlying distribution. Our primary contribution is to demonstrate that the objective function, although non-convex, is convex within a carefully chosen basin of attraction, enabling robust and unbiased estimation. Additionally, we establish three key theoretical guarantees for the estimator: (a) a deviation bound that is minimax optimal up to a logarithmic factor, (b) an improved, unbiased bound when the outliers are large, and (c) asymptotic normality as the sample size increases. Finally, we validate the theoretical findings through empirical comparisons with state-of-the-art estimators on both synthetic and real-world datasets, highlighting the proposed method's superior robustness, efficiency, and ability to mitigate outlier-induced bias.