We present a new finite-sample analysis of M-estimators of location in $\mathbb{R}^d$ based on the influence function. In particular, we show that the deviations of an M-estimator can be controlled through its influence function (or its score function). We then use concentration inequalities for M-estimators to study robust estimation of the mean in high dimension under adversarial corruption, for both bounded and unbounded score functions. For a sample of size $n$ whose inliers have covariance matrix $\Sigma$, we attain the minimax rate $\sqrt{\mathrm{Tr}(\Sigma)/n}+\sqrt{\|\Sigma\|_{op}\log(1/\delta)/n}$ with probability larger than $1-\delta$ in a heavy-tailed setting. A major advantage of our approach over other recently proposed methods is that our estimator is tractable and fast to compute even in very high dimension, with a complexity of $O(nd\log(\mathrm{Tr}(\Sigma)))$. In practice, the code that we make available with this article proves to be very fast.
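To make the notion of an M-estimator of location driven by a score function concrete, here is a minimal Python sketch of a generic Huber-type estimator computed by iterative reweighting. It is an illustration only and not necessarily the estimator or algorithm of the article: the function name `huber_location`, the threshold parameter `beta`, the coordinate-wise median initialization, and the stopping rule are all assumptions made for this example.

```python
import numpy as np


def huber_location(X, beta, n_iter=100, tol=1e-8):
    """Huber-type M-estimator of location (illustrative sketch).

    Looks for theta such that sum_i psi(X_i - theta) = 0, where the score
    function psi(x) = x * min(1, beta / ||x||) is bounded in norm by beta.
    """
    X = np.asarray(X, dtype=float)
    # Assumed starting point: the coordinate-wise median.
    theta = np.median(X, axis=0)
    for _ in range(n_iter):
        dist = np.linalg.norm(X - theta, axis=1)
        # Weights psi(r)/r: 1 inside the ball of radius beta, beta/r outside.
        w = np.minimum(1.0, beta / np.maximum(dist, 1e-12))
        new_theta = (w[:, None] * X).sum(axis=0) / w.sum()
        converged = np.linalg.norm(new_theta - theta) < tol
        theta = new_theta
        if converged:
            break
    return theta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 20))   # inliers centered at the origin
    X[:50] = 100.0                    # 5% of points replaced by gross outliers
    print("empirical mean error:", np.linalg.norm(X.mean(axis=0)))
    print("Huber-type error:    ", np.linalg.norm(huber_location(X, beta=9.0)))
```

Each pass of the loop costs $O(nd)$ operations. Because the score function is bounded, a corrupted point far from the bulk of the data pulls the estimate with a force of at most `beta`, whereas it shifts the empirical mean in proportion to its distance.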