We give the first efficient algorithm for learning halfspaces in the testable learning model recently defined by Rubinfeld and Vasilyan (2023). In this model, a learner certifies that the accuracy of its output hypothesis is near optimal whenever the training set passes an associated test, and training sets drawn from some target distribution -- e.g., the Gaussian -- must pass the test. This model is more challenging than distribution-specific agnostic or Massart noise models, where the learner is allowed to fail arbitrarily if the distributional assumption does not hold. We consider the setting where the target distribution is Gaussian (or more generally any strongly log-concave distribution) in $d$ dimensions and the noise model is either Massart or adversarial (agnostic). For Massart noise, our tester-learner runs in polynomial time and outputs a hypothesis with (information-theoretically optimal) error $\mathsf{opt} + \epsilon$ for any strongly log-concave target distribution. For adversarial noise, our tester-learner obtains error $O(\mathsf{opt}) + \epsilon$ in polynomial time when the target distribution is Gaussian; for strongly log-concave distributions, we obtain $\tilde{O}(\mathsf{opt}) + \epsilon$ in quasipolynomial time. Prior work on testable learning ignores the labels in the training set and checks that the empirical moments of the covariates are close to the moments of the base distribution. Here we develop new tests of independent interest that make critical use of the labels and combine them with the moment-matching approach of Gollakota et al. (2023). This enables us to simulate a variant of the algorithm of Diakonikolas et al. (2020) for learning noisy halfspaces via nonconvex SGD, but in the testable learning setting.
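For concreteness, the guarantee of a tester-learner can be stated informally as follows; this is a paraphrase of the model described above, with failure probabilities and sample complexity suppressed. Soundness: whenever the tester accepts the training set $S$, the learner outputs a hypothesis $h$ with $\Pr_{(x,y) \sim D_{XY}}[h(x) \neq y] \le \alpha \cdot \mathsf{opt} + \epsilon$, where $\mathsf{opt}$ is the error of the best halfspace, $\alpha = 1$ under Massart noise, $\alpha = O(1)$ under adversarial noise with a Gaussian marginal, and $\alpha = \tilde{O}(1)$ under adversarial noise with a strongly log-concave marginal. Completeness: if the marginal of the training distribution on $\mathbb{R}^d$ is indeed the target (e.g., $\mathcal{N}(0, I_d)$), the tester accepts with high probability.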
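As a rough illustration (not the paper's algorithm), the sketch below contrasts the label-oblivious moment-matching check used by prior work on testable learning with the nonconvex-SGD component referenced above, here instantiated as projected SGD on a LeakyReLU-type surrogate in the spirit of Diakonikolas et al. (2020). The function names, moment degree, tolerance, leakage parameter, and step sizes are illustrative placeholders; the actual tester-learner additionally relies on the new label-dependent tests developed in this work.

```python
# Illustrative sketch only: a low-degree moment-matching test (label-oblivious,
# as in prior work on testable learning) and projected SGD on a LeakyReLU-type
# surrogate loss. All thresholds and hyperparameters are placeholders.
import itertools
import numpy as np


def gaussian_monomial_moment(alpha):
    """E[prod_i x_i^{alpha_i}] for x ~ N(0, I): each odd power gives 0;
    each even power a contributes the double factorial (a - 1)!!."""
    moment = 1.0
    for a in alpha:
        if a % 2 == 1:
            return 0.0
        moment *= float(np.prod(np.arange(a - 1, 0, -2)))
    return moment


def moments_close_to_gaussian(X, degree=4, tol=0.1):
    """Accept iff every empirical monomial moment of total degree <= `degree`
    is within `tol` of the corresponding standard-Gaussian moment."""
    d = X.shape[1]
    for k in range(1, degree + 1):
        for idxs in itertools.combinations_with_replacement(range(d), k):
            alpha = np.bincount(idxs, minlength=d)  # exponent vector of the monomial
            empirical = np.mean(np.prod(X ** alpha, axis=1))
            if abs(empirical - gaussian_monomial_moment(alpha)) > tol:
                return False
    return True


def nonconvex_sgd_halfspace(X, y, steps=10_000, lam=0.1, lr=0.05, seed=0):
    """Projected SGD on the surrogate LeakyReLU_lam(-y * <w, x>) over unit-norm w.

    X: (n, d) covariates; y: labels in {-1, +1}. Returns a unit vector w whose
    sign(<w, x>) serves as the halfspace hypothesis.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)
    for _ in range(steps):
        i = rng.integers(n)
        margin = -y[i] * (X[i] @ w)
        slope = 1.0 if margin > 0 else lam      # derivative of the leaky ReLU
        grad = slope * (-y[i]) * X[i]
        w -= lr * grad
        w /= np.linalg.norm(w)                  # project back onto the unit sphere
    return w
```

Note that a degree-$k$ check involves $\binom{d+k}{k} = d^{O(k)}$ monomials, so moment matching of this form is only efficient when the degree is constant.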