In this work, we study the problem of robustly estimating the mean/location parameter of distributions without moment bounds. For a large class of distributions satisfying natural symmetry constraints, we give a sequence of algorithms that efficiently estimate the location without incurring dimension-dependent factors in the error. Concretely, suppose an adversary can arbitrarily corrupt an $\varepsilon$-fraction of the observed samples. For every $k \in \mathbb{N}$, we design an estimator using time and samples $\tilde{O}(d^k)$ such that the dependence of the error on the corruption level $\varepsilon$ is an additive term of $O(\varepsilon^{1-\frac{1}{2k}})$. The dependence on other problem parameters is also nearly optimal. Our class contains products of arbitrary symmetric one-dimensional distributions as well as elliptical distributions, a vast generalization of the Gaussian distribution. Examples include product Cauchy distributions and multivariate $t$-distributions. In particular, even the first moment might not exist. We provide the first efficient algorithms for this class of distributions. Previously, such results were only known under boundedness assumptions on the moments of the distribution and, in particular, are provably impossible in the absence of symmetry [KSS18, CTBJ22]. For the class of distributions we consider, all previous estimators either require exponential time or incur error depending on the dimension. Our algorithms are based on a generalization of the filtering technique [DK22]. We show how this machinery can be combined with a Huber-loss-based approach to work with projections of the noise. Moreover, we show how sum-of-squares proofs can be used to obtain algorithmic guarantees even for distributions without a first moment. We believe that this approach may find other applications in future work.
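To make the role of symmetry and the Huber loss concrete, the following is a minimal one-dimensional sketch and not the algorithm of this work: it has no filtering step, no sum-of-squares component, and does not address the high-dimensional setting. It minimizes the Huber loss by iteratively reweighted averaging on a deterministic Cauchy-like sample in which an $\varepsilon$-fraction of points has been overwritten. The function names, the threshold `delta`, the quantile-based sample, and the particular adversary are all assumptions made for illustration.

```python
import math
import statistics


def cauchy_quantile_sample(n, location):
    """Deterministic stand-in for n Cauchy draws: evenly spaced quantiles.

    The points are symmetric about `location`; the underlying Cauchy law
    has no first moment.
    """
    return [location + math.tan(math.pi * ((i + 0.5) / n - 0.5)) for i in range(n)]


def corrupt(xs, eps, value):
    """Hypothetical adversary: overwrite the smallest eps-fraction with `value`."""
    m = int(eps * len(xs))
    return sorted(xs)[m:] + [value] * m


def huber_location(xs, delta=1.0, iters=200, tol=1e-10):
    """Minimize sum_i huber_delta(x_i - mu) over mu by reweighted averaging.

    Each step is a weighted mean with weight 1 for residuals of size at most
    delta and delta / |residual| otherwise, so far-away points have bounded
    influence on the estimate.
    """
    mu = statistics.median(xs)  # robust starting point
    for _ in range(iters):
        ws = [1.0 if abs(x - mu) <= delta else delta / abs(x - mu) for x in xs]
        new_mu = sum(w * x for w, x in zip(ws, xs)) / sum(ws)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


if __name__ == "__main__":
    clean = cauchy_quantile_sample(200, location=3.0)
    bad = corrupt(clean, eps=0.1, value=1e6)
    print("sample mean:   ", statistics.fmean(bad))
    print("Huber estimate:", huber_location(bad))
```

On the corrupted sample the sample mean is dragged arbitrarily far by the outliers, while the Huber estimate stays within a bounded distance of the true location 3. The remaining bias comes from the corrupted fraction itself. Applying such a one-dimensional estimator coordinate by coordinate is a natural baseline, but an adversary can shift every coordinate at once, so its Euclidean error grows with the dimension $d$.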