Many high-dimensional function classes admit fast agnostic learning algorithms when assumptions can be made on the distribution of examples, such as Gaussianity or uniformity over the domain. But how can one be confident that the data indeed satisfy such an assumption, so that one can trust the output quality of the agnostic learning algorithm? We propose a model for systematically studying the design of tester-learner pairs $(\mathcal{A},\mathcal{T})$, such that if the distribution of examples in the data passes the tester $\mathcal{T}$, then one can safely trust the output of the agnostic learner $\mathcal{A}$ on the data. To demonstrate the power of the model, we apply it to the classical problem of agnostically learning halfspaces under the standard Gaussian distribution and present a tester-learner pair with a combined run-time of $n^{\tilde{O}(1/\epsilon^4)}$. This qualitatively matches the run-time of the best known ordinary agnostic learning algorithms for this task. In contrast, finite-sample Gaussianity testers do not exist for the $L_1$ and earth mover's distance (EMD) measures. A key step is to show that halfspaces are well approximated by low-degree polynomials relative to distributions whose low-degree moments are close to those of a Gaussian. We also go beyond spherically symmetric distributions and give a tester-learner pair for halfspaces under the uniform distribution on $\{0,1\}^n$ with a combined run-time of $n^{\tilde{O}(1/\epsilon^4)}$. This is achieved using polynomial approximation theory and critical index machinery. Finally, we show that there exist well-studied settings in which agnostic learning algorithms with run-time $2^{\tilde{O}(\sqrt{n})}$ are available, yet the combined run-time of any tester-learner pair must be as high as $2^{\Omega(n)}$. Accordingly, the design of tester-learner pairs is a research direction in its own right, independent of standard agnostic learning.
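As a rough illustration of the moment-matching idea mentioned above, the following toy sketch accepts a sample only if all of its empirical moments up to a given degree are close to those of the standard Gaussian. It is not the tester from this work: the function names `gaussian_moment` and `moment_tester`, the brute-force enumeration of monomials, and the `degree` and `tolerance` parameters are all assumptions made for illustration, and the abstract does not specify the actual tester or its parameters.

```python
import itertools
import math


def gaussian_moment(exponents):
    """Return E[prod_i z_i^{a_i}] for independent standard normals z_i."""
    value = 1
    for a in exponents:
        if a % 2 == 1:
            return 0  # odd moments of a standard normal vanish
        # For even a, E[z^a] = (a - 1)!! = (a - 1)(a - 3)...1
        value *= math.prod(range(a - 1, 0, -2))
    return value


def moment_tester(samples, degree, tolerance):
    """Toy tester (hypothetical): accept iff every empirical monomial moment
    of total degree 1..degree is within `tolerance` of the Gaussian value."""
    n = len(samples[0])
    for d in range(1, degree + 1):
        # Each multiset of d coordinate indices names one monomial.
        for idx in itertools.combinations_with_replacement(range(n), d):
            exponents = [idx.count(i) for i in range(n)]
            empirical = sum(
                math.prod(x[i] for i in idx) for x in samples
            ) / len(samples)
            if abs(empirical - gaussian_moment(exponents)) > tolerance:
                return False
    return True
```

With enough samples, such a check would be expected to accept Gaussian data and to reject, for example, points drawn uniformly from $\{-1,1\}^n$, whose fourth moment per coordinate is 1 rather than 3. The number of monomials grows as $n^{O(k)}$ for degree $k$, so this brute-force version is only practical for small $n$ and $k$.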