We study active structure learning of Bayesian networks in an observational setting, in which there are external limitations on the number of variable values that can be observed from the same sample. Random samples are drawn from the joint distribution of the network variables, and the algorithm iteratively selects which variables to observe in the next sample. We propose a new active learning algorithm for this setting that, with high probability, finds a structure whose score is $\epsilon$-close to the optimal score. We show that for a class of distributions that we term stable, a sample-complexity reduction of up to a factor of $\widetilde{\Omega}(d^3)$ can be obtained, where $d$ is the number of network variables. We further show that in the worst case, the sample complexity of the active algorithm is guaranteed to be almost the same as that of a naive baseline algorithm. To supplement the theoretical results, we report experiments that compare the performance of the new active algorithm to the naive baseline and demonstrate the sample-complexity improvements. Code for the algorithm and the experiments is provided at https://github.com/noabdavid/activeBNSL.