The spread of an undesirable contact process, such as an infectious disease (e.g., COVID-19), is contained through testing and isolation of infected nodes. The temporal and spatial evolution of the process, along with containment through isolation, renders such detection fundamentally different from active search detection strategies. In this work, through an active learning approach, we design testing and isolation strategies that contain the spread and minimize cumulative infections under a given test budget. We prove that the objective can be optimized, with performance guarantees, by greedily selecting the nodes to test. We further design reward-based methodologies that effectively minimize an upper bound on the cumulative infections and are computationally more tractable in large networks. These policies, however, require knowledge of the nodes' infection probabilities, which change dynamically and must be learned through sequential testing. We develop a message-passing framework for this purpose and, building on it, reveal novel tradeoffs between exploitation of knowledge through reward-based heuristics and exploration of the unknown through carefully designed probabilistic testing. These tradeoffs are fundamentally distinct from their classical counterparts in active search and multi-armed bandit (MAB) problems. We prove the necessity of exploration in a stylized network and demonstrate through simulations that exploration can outperform exploitation on various synthetic and real-data networks, depending on the parameters of the network and of the spread.
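To make the exploit-versus-explore distinction concrete, the following Python sketch is an illustrative toy model, not the paper's algorithm: it simulates a discrete-time SI spread on a random contact graph, maintains rough infection-probability estimates with a one-hop message-passing-style update, and compares a purely exploitative policy (test the nodes with the highest estimated probability) with a purely exploratory one (test uniformly at random, a crude stand-in for the carefully designed probabilistic testing mentioned above). Positive tests lead to isolation, and cumulative infections are reported. All function names, the belief update, the test policies, and the parameter values are assumptions made for illustration.

```python
import random
from collections import defaultdict

def erdos_renyi(n, p, rng):
    """Random undirected contact graph as an adjacency list."""
    adj = defaultdict(set)
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def step_spread(adj, infected, isolated, beta, rng):
    """One discrete-time SI step: every infected, non-isolated node infects
    each susceptible, non-isolated neighbour independently w.p. beta."""
    new = set()
    for u in infected - isolated:
        for v in adj[u]:
            if v not in infected and v not in isolated and rng.random() < beta:
                new.add(v)
    return infected | new

def update_beliefs(adj, belief, isolated, beta):
    """One-hop message-passing-style update of estimated infection
    probabilities, assuming neighbours transmit independently."""
    new_belief = {}
    for v in belief:
        if v in isolated:
            new_belief[v] = belief[v]
            continue
        p_no_incoming = 1.0
        for u in adj[v]:
            if u not in isolated:
                p_no_incoming *= 1.0 - beta * belief[u]
        new_belief[v] = 1.0 - (1.0 - belief[v]) * p_no_incoming
    return new_belief

def choose_tests(belief, isolated, budget, mode, rng):
    """Exploit: test the nodes with the highest estimated infection
    probability.  Explore: test uniformly at random."""
    candidates = [v for v in belief if v not in isolated]
    if mode == "exploit":
        return sorted(candidates, key=lambda v: belief[v], reverse=True)[:budget]
    return rng.sample(candidates, min(budget, len(candidates)))

def simulate(mode, n=200, p=0.03, beta=0.15, budget=5, horizon=30, seed=0):
    rng = random.Random(seed)
    adj = erdos_renyi(n, p, rng)
    infected = {rng.randrange(n)}              # hidden ground truth
    isolated = set()
    belief = {v: 1.0 / n for v in range(n)}    # prior infection probabilities
    for _ in range(horizon):
        for v in choose_tests(belief, isolated, budget, mode, rng):
            if v in infected:                  # positive test: isolate the node
                isolated.add(v)
                belief[v] = 1.0
            else:
                belief[v] = 0.0
        infected = step_spread(adj, infected, isolated, beta, rng)
        belief = update_beliefs(adj, belief, isolated, beta)
    return len(infected)                       # cumulative infections

if __name__ == "__main__":
    for mode in ("exploit", "explore"):
        avg = sum(simulate(mode, seed=s) for s in range(20)) / 20
        print(f"{mode:8s}: avg cumulative infections = {avg:.1f}")
```

In this toy model either policy can come out ahead; which one wins varies with the graph density, infection rate, and test budget, echoing the dependence on network and spread parameters noted above.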