We give a quasipolynomial-time algorithm for learning stochastic decision trees that is optimally resilient to adversarial noise. Given an $\eta$-corrupted set of uniform random samples labeled by a size-$s$ stochastic decision tree, our algorithm runs in time $n^{O(\log(s/\varepsilon)/\varepsilon^2)}$ and returns a hypothesis with error within an additive $2\eta + \varepsilon$ of the Bayes optimal. An additive $2\eta$ is the information-theoretic minimum. Previously no non-trivial algorithm with a guarantee of $O(\eta) + \varepsilon$ was known, even for weaker noise models. Our algorithm is furthermore proper, returning a hypothesis that is itself a decision tree; previously no such algorithm was known even in the noiseless setting.