In list-decodable learning, we are given a set of data points such that an $α$-fraction of these points come from a nice distribution $D$, for some small $α\ll 1$, and the goal is to output a short list of candidate solutions, such that at least one element of this list recovers some non-trivial information about $D$. By now, there is a large body of work on this topic; however, while many algorithms can achieve optimal list size in terms of $α$, all known algorithms must incur error which grows, in some cases quite poorly, with $1/α$. In this paper, we ask if this is inherent: is it possible to trade off list size with accuracy in list-decodable learning? More formally, given $ε > 0$, can we output a slightly larger list in terms of $α$ and $ε$, but so that one element of this list has error at most $ε$ with respect to the ground truth? We call this problem high-accuracy list-decodable learning. Our main result is that non-trivial high-accuracy guarantees, both information-theoretically and algorithmically, are possible for the canonical setting of list-decodable mean estimation of identity-covariance Gaussians. Specifically, we demonstrate that there exists a list of candidate means of size at most $L = \exp \left( O\left( \tfrac{\log^2 1 / α}{ε^2} \right)\right)$ so that one of the elements of this list has $\ell_2$ distance at most $ε$ to the true mean. We also design an algorithm that outputs such a list with runtime and sample complexity $n = d^{O(\log L)} + \exp \exp (\widetilde{O}(\log L))$. We do so by demonstrating a completely novel proof of identifiability, as well as a new algorithmic way of leveraging this proof without the sum-of-squares hierarchy, which may be of independent technical interest.
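The problem setup above can be made concrete with a toy simulation. The sketch below (not the paper's algorithm) generates data where only an $α$-fraction comes from $N(\mu, I)$ and the rest are outliers, then shows why a *list* of candidates is needed: the single empirical mean is badly corrupted, but a short list, here built by a purely illustrative local-averaging heuristic around a few candidate centers, contains one estimate close to the true mean. All parameter values and the clustering heuristic are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n, alpha = 5, 2000, 0.1            # dimension, sample size, inlier fraction
true_mean = 3.0 * rng.normal(size=d)  # unknown mean we hope to recover

# alpha-fraction of inliers from N(true_mean, I); the rest are adversarial-style
# outliers placed far away (for simplicity the rows are not shuffled here).
n_in = int(alpha * n)
inliers = rng.normal(loc=true_mean, scale=1.0, size=(n_in, d))
outliers = rng.normal(loc=-10.0 * true_mean, scale=1.0, size=(n - n_in, d))
data = np.vstack([inliers, outliers])

# A single estimate fails: the empirical mean is dragged toward the outliers.
global_err = np.linalg.norm(data.mean(axis=0) - true_mean)

# A short LIST of candidates succeeds: average the points near each of a few
# candidate centers (a toy stand-in for a genuine list-decoding algorithm).
candidates = []
for center in data[:: n // 20]:       # 20 evenly spaced candidate centers
    near = data[np.linalg.norm(data - center, axis=1) < 3.0 * np.sqrt(d)]
    candidates.append(near.mean(axis=0))
best_err = min(np.linalg.norm(c - true_mean) for c in candidates)

print(f"empirical-mean error:      {global_err:.2f}")
print(f"best list-candidate error: {best_err:.2f}")
```

Because at least one candidate center falls among the inliers, its local average lands near the true mean, while the naive global mean has error on the order of $\|\mu\|$; this is the sense in which one element of the list "recovers non-trivial information about $D$".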