We study the problem of determining the best intervention in a Causal Bayesian Network (CBN) specified only by its causal graph. We model this as a stochastic multi-armed bandit (MAB) problem with side-information, where the interventions correspond to the arms of the bandit instance. First, we propose a simple regret minimization algorithm that takes as input a semi-Markovian causal graph with atomic interventions and possibly unobservable variables, and achieves $\tilde{O}(\sqrt{M/T})$ expected simple regret, where $M$ is dependent on the input CBN and could be very small compared to the number of arms. We also show that this is almost optimal for CBNs described by causal graphs having an $n$-ary tree structure. Our simple regret minimization results, both upper and lower bound, subsume previous results in the literature, which assumed additional structural restrictions on the input causal graph. In particular, our results indicate that the simple regret guarantee of our proposed algorithm can only be improved by considering more nuanced structural restrictions on the causal graph. Next, we propose a cumulative regret minimization algorithm that takes as input a general causal graph with all observable nodes and atomic interventions and performs better than the optimal MAB algorithm that does not take causal side-information into account. We also experimentally compare both our algorithms with the best known algorithms in the literature. To the best of our knowledge, this work gives the first simple and cumulative regret minimization algorithms for CBNs with general causal graphs under atomic interventions and having unobserved confounders.
翻译:我们研究在Causal Bayesian网络(CBN)中确定最佳干预方法的问题,只是其因果图所指明的。我们用侧信息来模拟这个问题,因为干预方法与土匪的手臂相对应。首先,我们提出一个简单的遗憾最小化算法,将半马尔科瓦因果图和原子干预和可能无法观察的变量作为投入,并实现美元(sqrt{O})(sqrt{M/T})预期的简单遗憾,即$M$依赖于输入的CBN,与武器数量相比,其累积性可能非常小。我们还表明,对于带有美元树结构的因果图所描述的CBBNs几乎是最佳的。我们简单的遗憾最小化算算法,它假定对输入的因果性图表有额外的结构性限制。特别是,我们提议的算法的简单遗憾保证只能通过首先考虑对因果图的侧面结构限制来改进,而与武器数量相比可能非常小。我们还表明,对于C的累积性图表来说,这几乎都是最佳的因果算算法,我们最坏的。