We give an $n^{O(\log\log n)}$-time membership query algorithm for properly and agnostically learning decision trees under the uniform distribution over $\{\pm 1\}^n$. Even in the realizable setting, the previous fastest runtime was $n^{O(\log n)}$, a consequence of a classic algorithm of Ehrenfeucht and Haussler. Our algorithm shares similarities with practical heuristics for learning decision trees, which we augment with additional ideas to circumvent known lower bounds against these heuristics. To analyze our algorithm, we prove a new structural result for decision trees that strengthens a theorem of O'Donnell, Saks, Schramm, and Servedio. While the OSSS theorem says that every decision tree has an influential variable, we show how every decision tree can be "pruned" so that every variable in the resulting tree is influential.