We give tight statistical query (SQ) lower bounds for learning halfspaces in the presence of Massart noise. In particular, suppose that each label is flipped with probability at most $\eta$. We show that for any $\eta \in [0,1/2]$, every SQ algorithm achieving misclassification error better than $\eta$ requires queries of superpolynomial accuracy or at least a superpolynomial number of queries. Further, this continues to hold even if the information-theoretically optimal error $\mathrm{OPT}$ is as small as $\exp\left(-\log^c(d)\right)$, where $d$ is the dimension and $0 < c < 1$ is an arbitrary absolute constant, and an overwhelming fraction of examples are noiseless. Our lower bound matches the guarantees of known polynomial-time algorithms, which are also implementable in the SQ framework. Previously, such lower bounds only ruled out algorithms achieving error $\mathrm{OPT} + \epsilon$, error better than $\Omega(\eta)$, or, if $\eta$ is close to $1/2$, error $\eta - o_\eta(1)$, where the term $o_\eta(1)$ is constant in $d$ but goes to 0 as $\eta$ approaches $1/2$. As a consequence, we also show that achieving misclassification error better than $1/2$ in the $(A,\alpha)$-Tsybakov model is SQ-hard for $A$ constant and $\alpha$ bounded away from 1.
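To make the noise model concrete, the following is a minimal sketch (function and parameter names are hypothetical, not from the paper) of drawing labeled halfspace examples under Massart noise: each label $\mathrm{sign}(\langle w, x\rangle)$ is flipped independently with an example-dependent probability $\eta(x) \le \eta$.

```python
import numpy as np

rng = np.random.default_rng(0)

def massart_sample(w, eta_max, n, d):
    """Draw n labeled examples in R^d from the halfspace sign(<w, x>).

    Under the Massart condition, an adversary may choose any flip
    probability eta(x) in [0, eta_max] per example; here we simply
    draw eta(x) uniformly at random for illustration.
    """
    X = rng.standard_normal((n, d))          # standard Gaussian marginals
    clean = np.sign(X @ w)                   # noiseless halfspace labels
    eta_x = rng.uniform(0, eta_max, size=n)  # per-example flip probability
    flips = rng.random(n) < eta_x            # which labels get corrupted
    y = np.where(flips, -clean, clean)
    return X, y

w = np.ones(5) / np.sqrt(5)                  # unit normal of the halfspace
X, y = massart_sample(w, eta_max=0.3, n=1000, d=5)
```

With $\eta(x)$ drawn uniformly from $[0, \eta]$ as above, the expected fraction of flipped labels is $\eta/2$; the hardness results in the abstract hold even when the adversary keeps most examples noiseless.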