In recent years, there are many attempts to understand popular heuristics. An example of such a heuristic algorithm is the ID3 algorithm for learning decision trees. This algorithm is commonly used in practice, but there are very few theoretical works studying its behavior. In this paper, we analyze the ID3 algorithm, when the target function is a $k$-Junta, a function that depends on $k$ out of $n$ variables of the input. We prove that when $k = \log n$, the ID3 algorithm learns in polynomial time $k$-Juntas, in the smoothed analysis model of Kalai & Teng. That is, we show a learnability result when the observed distribution is a "noisy" variant of the original distribution.
翻译:近年来,人们多次试图理解流行的文理学。这种文理学算法的一个例子就是学习决策树的ID3算法。这种算法在实践中通常使用,但很少有理论作品研究它的行为。在本文中,我们分析了ID3算法,当目标函数为$k$-Junta时,这个函数从输入的n美元变量中依赖k美元。我们证明,当当美元=\log n$时,ID3算法在加莱和Teng的平滑分析模型中学习到$k$-Juntas。也就是说,当观察到的分布是原始分布的“新”变体时,我们展示了一种可学习的结果。