A remarkable recent paper by Rubinfeld and Vasilyan (2022) initiated the study of \emph{testable learning}, where the goal is to replace hard-to-verify distributional assumptions (such as Gaussianity) with efficiently testable ones and to require that the learner succeed whenever the unknown distribution passes the corresponding test. In this model, they gave an efficient algorithm for learning halfspaces under testable assumptions that are provably satisfied by Gaussians. In this paper we give a powerful new approach for developing algorithms for testable learning using tools from moment matching and metric distances in probability. We obtain efficient testable learners for any concept class that admits low-degree \emph{sandwiching polynomials}, capturing most important examples for which we have ordinary agnostic learners. We recover the results of Rubinfeld and Vasilyan as a corollary of our techniques while achieving improved, near-optimal sample complexity bounds for a broad range of concept classes and distributions. Surprisingly, we show that the information-theoretic sample complexity of testable learning is tightly characterized by the Rademacher complexity of the concept class, one of the most well-studied measures in statistical learning theory. In particular, uniform convergence is necessary and sufficient for testable learning. This leads to a fundamental separation from (ordinary) distribution-specific agnostic learning, where uniform convergence is sufficient but not necessary.
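To make the moment-matching idea concrete, the following is a minimal sketch of a tester that accepts a sample only if its low-degree empirical moments are close to those of the standard Gaussian $N(0, I)$. It is an illustration under stated assumptions, not the tester analyzed in the paper: the function names, the entrywise comparison of monomial moments, and the `degree` and `tolerance` parameters are all hypothetical choices, and the abstract does not specify how these quantities should be set.

```python
import itertools
import math
from collections import Counter


def gaussian_monomial_moment(exponents):
    """E[prod_i z_i^{a_i}] for independent standard normal coordinates z_i."""
    value = 1
    for a in exponents:
        if a % 2 == 1:
            return 0  # odd moments of N(0, 1) vanish
        # Even moment of N(0, 1): (a - 1)!! = a! / (2^(a/2) * (a/2)!)
        value *= math.factorial(a) // (2 ** (a // 2) * math.factorial(a // 2))
    return value


def moment_matching_test(sample, degree, tolerance):
    """Hypothetical tester: accept iff every empirical monomial moment of
    total degree 1..degree is within `tolerance` of its N(0, I) value."""
    dim = len(sample[0])
    for k in range(1, degree + 1):
        # Each multiset of k coordinates corresponds to one monomial of degree k.
        for coords in itertools.combinations_with_replacement(range(dim), k):
            counts = Counter(coords)
            exponents = [counts.get(i, 0) for i in range(dim)]
            empirical = sum(
                math.prod(x[i] ** a for i, a in enumerate(exponents))
                for x in sample
            ) / len(sample)
            if abs(empirical - gaussian_monomial_moment(exponents)) > tolerance:
                return False  # sample fails the distributional test
    return True
```

For example, the two-point sample `[(1.0,), (-1.0,)]` matches the Gaussian mean, variance, and third moment, so it passes at `degree=3`, but its fourth moment is 1 rather than 3, so it is rejected at `degree=4`. In the testable-learning model, a learner would be run only on samples that pass such a test, and its guarantee would be required to hold for every distribution that passes.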