A Bayesian network is a directed acyclic graph that represents statistical dependencies between the variables of a joint probability distribution. A fundamental task in data science is to learn a Bayesian network from observed data. \textsc{Polytree Learning} is the problem of learning an optimal Bayesian network that fulfills the additional property that its underlying undirected graph is a forest. In this work, we revisit the complexity of \textsc{Polytree Learning}. We show that \textsc{Polytree Learning} can be solved in $3^n \cdot |I|^{\mathcal{O}(1)}$ time, where $n$ is the number of variables and $|I|$ is the total instance size. Moreover, we consider how the complexity of \textsc{Polytree Learning} depends on the number $d$ of variables that might receive a nonempty parent set in the final DAG. We show that \textsc{Polytree Learning} has no $f(d)\cdot |I|^{\mathcal{O}(1)}$-time algorithm, unlike Bayesian network learning, which can be solved in $2^d \cdot |I|^{\mathcal{O}(1)}$ time. We show that, in contrast, if $d$ and the maximum parent set size are bounded, then we can obtain efficient algorithms.
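The structural constraint that defines a polytree (the underlying undirected graph must be a forest) can be illustrated with a minimal sketch. The function name `is_polytree` and the representation of a network as a dictionary of parent sets are assumptions made for illustration only; this checks feasibility of a given structure and is not one of the learning algorithms discussed above.

```python
from typing import Dict, Hashable, Iterable


def is_polytree(parents: Dict[Hashable, Iterable[Hashable]]) -> bool:
    """Return True if the arcs p -> v (for p in parents[v]) form a polytree.

    Hypothetical helper: `parents` maps each variable to its parent set.
    A polytree is a directed graph whose underlying undirected graph is a
    forest, so it suffices to check that no arc closes an undirected cycle.
    """
    # Collect every variable that occurs as a child or as a parent.
    nodes = set(parents)
    for parent_set in parents.values():
        nodes.update(parent_set)

    # Union-find over the skeleton (the underlying undirected graph).
    root = {v: v for v in nodes}

    def find(v):
        while root[v] != v:
            root[v] = root[root[v]]  # path halving
            v = root[v]
        return v

    for child, parent_set in parents.items():
        for p in set(parent_set):
            a, b = find(p), find(child)
            if a == b:
                # This arc would close an undirected cycle (or is a self-loop).
                return False
            root[a] = b
    return True
```

For example, `is_polytree({"c": ["a", "b"], "d": ["c"]})` accepts a structure whose skeleton is a tree, while `{"b": ["a"], "c": ["a", "b"]}` is a valid DAG but is rejected because its skeleton contains a triangle. Any orientation of a forest is automatically acyclic, so no separate check for directed cycles is needed.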