Bayesian networks (BNs) are a widely used graphical model in machine learning for representing knowledge under uncertainty. Mainstream BN structure learning methods require a large number of conditional independence (CI) tests. The learning process is therefore very time-consuming, especially for high-dimensional problems, which hinders the adoption of BNs in more applications. Existing works attempt to accelerate learning through parallelism, but suffer from load imbalance, costly atomic operations, and dominant parallel overhead. In this paper, we propose Fast-BNS, a fast solution on multi-core CPUs that improves the efficiency of BN structure learning. Fast-BNS is powered by a series of efficiency optimizations: (i) a dynamic work pool that monitors the processing of edges and better schedules workloads among threads, (ii) grouping the CI tests of edges with the same endpoints to reduce the number of unnecessary CI tests, (iii) cache-friendly data storage to improve memory efficiency, and (iv) generating conditioning sets on the fly to avoid extra memory consumption. A comprehensive experimental study shows that the sequential version of Fast-BNS is up to 50 times faster than its counterpart, and that the parallel version achieves a 4.8 to 24.5 times speedup over the state-of-the-art multi-threaded solution. Moreover, Fast-BNS scales well with both network size and sample size. The Fast-BNS source code is freely available at https://github.com/jjiantong/FastBN.
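To make optimizations (ii) and (iv) concrete, the following is a minimal Python sketch and not the Fast-BNS implementation. It assumes a constraint-based setting in which the candidate conditioning sets for an edge x - y are drawn from the current neighbors of x and of y. The names `conditioning_sets`, `edge_is_removable`, `adj`, and `ci_test` are hypothetical and introduced only for illustration. Conditioning sets are produced lazily by a generator, so none are stored in advance, and all CI tests for one edge are handled together so that testing stops at the first set that separates the endpoints.

```python
from itertools import combinations


def conditioning_sets(adj, x, y, depth):
    """Lazily yield size-`depth` conditioning sets for the edge x - y.

    `adj` maps each node to the set of its current neighbors (assumed layout).
    Sets are generated one at a time, so nothing is materialized up front.
    """
    from_x = sorted(adj[x] - {y})
    from_y = sorted(adj[y] - {x})
    x_side = set(from_x)

    # Subsets of x's neighbors first.
    for subset in combinations(from_x, depth):
        yield subset

    # Then subsets of y's neighbors, skipping any set that lies entirely
    # within x's neighbors, since it was already produced above.
    for subset in combinations(from_y, depth):
        if not set(subset) <= x_side:
            yield subset


def edge_is_removable(adj, x, y, depth, ci_test):
    """Run the grouped CI tests for one edge, stopping at the first success.

    `ci_test(x, y, z)` is a hypothetical callback that returns True when
    x and y are judged conditionally independent given the tuple z.
    """
    for z in conditioning_sets(adj, x, y, depth):
        if ci_test(x, y, z):
            return True, z
    return False, None
```

Because the generator is consumed one set at a time, memory use stays constant in the number of candidate sets, and the early return skips every remaining test for that edge once a separating set is found.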