We study the problem of list-decodable linear regression, where an adversary can corrupt a majority of the examples. Specifically, we are given a set $T$ of labeled examples $(x, y) \in \mathbb{R}^d \times \mathbb{R}$ and a parameter $0< \alpha <1/2$ such that an $\alpha$-fraction of the points in $T$ are i.i.d. samples from a linear regression model with Gaussian covariates, and the remaining $(1-\alpha)$-fraction of the points are drawn from an arbitrary noise distribution. The goal is to output a small list of hypothesis vectors such that at least one of them is close to the target regression vector. Our main result is a Statistical Query (SQ) lower bound of $d^{\mathrm{poly}(1/\alpha)}$ for this problem. Our SQ lower bound qualitatively matches the performance of previously developed algorithms, providing evidence that current upper bounds for this task are nearly best possible.
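To make the contamination model concrete, below is a minimal, hypothetical Python sketch of the data-generation process described in the abstract: an $\alpha$-fraction of the points are i.i.d. samples from a Gaussian-covariate linear regression model, and the remaining $(1-\alpha)$-fraction are arbitrary. The function name, the noise level, and the specific choice of outlier distribution (a planted "fake" regression vector) are illustrative assumptions, not part of the paper.

```python
import numpy as np

def sample_list_decodable_regression(n, d, alpha, w_star, sigma=0.1, seed=None):
    """Hypothetical generator for the list-decodable regression model:
    an alpha-fraction of points are clean samples y = <w*, x> + noise with
    Gaussian covariates x; the rest come from an arbitrary distribution
    (here, one illustrative adversarial choice)."""
    rng = np.random.default_rng(seed)
    n_clean = int(alpha * n)

    # Clean samples: standard Gaussian covariates, Gaussian label noise.
    X_clean = rng.standard_normal((n_clean, d))
    y_clean = X_clean @ w_star + sigma * rng.standard_normal(n_clean)

    # Outliers: the adversary may use any distribution; here we plant a
    # different regression vector, which is just one possible choice.
    w_fake = rng.standard_normal(d)
    X_out = rng.standard_normal((n - n_clean, d))
    y_out = X_out @ w_fake + sigma * rng.standard_normal(n - n_clean)

    X = np.vstack([X_clean, X_out])
    y = np.concatenate([y_clean, y_out])
    perm = rng.permutation(n)
    return X[perm], y[perm]

# Example: with alpha = 0.1, only 10% of the points follow the true model.
d = 5
w_star = np.ones(d)
X, y = sample_list_decodable_regression(n=1000, d=d, alpha=0.1,
                                        w_star=w_star, seed=0)
```

Since the clean points are a minority, no single hypothesis can be guaranteed to be correct; this is why the goal is a small list of candidate vectors, at least one of which is close to the target.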