There are many important high dimensional function classes that have fast agnostic learning algorithms when strong assumptions on the distribution of examples can be made, such as Gaussianity or uniformity over the domain. But how can one be sufficiently confident that the data indeed satisfies the distributional assumption, so that one can trust in the output quality of the agnostic learning algorithm? We propose a model by which to systematically study the design of tester-learner pairs $(\mathcal{A},\mathcal{T})$, such that if the distribution on examples in the data passes the tester $\mathcal{T}$ then one can safely trust the output of the agnostic learner $\mathcal{A}$ on the data.

To demonstrate the power of the model, we apply it to the classical problem of agnostically learning halfspaces under the standard Gaussian distribution and present a tester-learner pair with a combined run-time of $n^{\tilde{O}(1/\epsilon^4)}$. This qualitatively matches that of the best known ordinary agnostic learning algorithms for this task. In contrast, finite sample Gaussian distribution testers do not exist for the $L_1$ and EMD distance measures. A key step in the analysis is a novel characterization of concentration and anti-concentration properties of a distribution whose low-degree moments approximately match those of a Gaussian. We also use tools from polynomial approximation theory.

In contrast, we show strong lower bounds on the combined run-times of tester-learner pairs for the problems of agnostically learning convex sets under the Gaussian distribution and for monotone Boolean functions under the uniform distribution over $\{0,1\}^n$. Through these lower bounds we exhibit natural problems where there is a dramatic gap between standard agnostic learning run-time and the run-time of the best tester-learner pair.
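To make the flavor of the moment-matching idea concrete, the following is a minimal, purely illustrative sketch of a tester that accepts only if every empirical moment up to a fixed degree is close to the corresponding moment of the standard Gaussian. The function names and the `degree` and `tol` parameters are hypothetical choices made for illustration; this is not the paper's tester and carries none of its sample-complexity or soundness guarantees.

```python
import numpy as np
from itertools import combinations_with_replacement


def gaussian_moment(multidegree):
    """Moment E[prod_i x_i^{d_i}] of the standard Gaussian: a product of
    univariate moments, which are 0 for odd degree and (d-1)!! for even d."""
    m = 1.0
    for d in multidegree:
        if d % 2 == 1:
            return 0.0
        m *= float(np.prod(np.arange(d - 1, 0, -2))) if d > 0 else 1.0
    return m


def moment_matching_tester(samples, degree=4, tol=0.1):
    """Illustrative tester (hypothetical thresholds): reject if any empirical
    moment of total degree <= `degree` deviates from the standard-Gaussian
    moment by more than `tol`."""
    _, dim = samples.shape
    for k in range(1, degree + 1):
        for idx in combinations_with_replacement(range(dim), k):
            # Empirical moment of the monomial indexed by idx.
            emp = np.mean(np.prod(samples[:, list(idx)], axis=1))
            deg = tuple(np.bincount(idx, minlength=dim))
            if abs(emp - gaussian_moment(deg)) > tol:
                return False  # distributional assumption not certified
    return True  # low-degree moments look Gaussian; run the agnostic learner


if __name__ == "__main__":
    # Genuinely Gaussian data should pass the check with high probability.
    rng = np.random.default_rng(0)
    samples = rng.standard_normal((50_000, 3))
    print(moment_matching_tester(samples, degree=4, tol=0.1))
```

Only as a rough intuition: a tester of this shape certifies the precondition under which the learner's guarantee is invoked, which is the division of labor between $\mathcal{T}$ and $\mathcal{A}$ in the proposed model.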