The Restless Multi-Armed Bandit (RMAB) is an apt model for decision-making problems in public health interventions (e.g., tuberculosis care, maternal and child care), anti-poaching planning, sensor monitoring, personalized recommendations, and many other domains. Existing research on RMABs has contributed mechanisms and theoretical results for a wide variety of settings in which the focus is on maximizing expected value. In this paper, we aim to ensure that RMAB decision making is also fair to the different arms while maximizing expected value. In public health settings, this means that different people and/or communities are fairly represented when intervention decisions are made. To achieve this goal, we formally define fairness constraints for RMABs and provide planning and learning methods that solve RMABs in a fair manner. We establish key theoretical properties of fair RMABs and show experimentally that our proposed methods handle fairness constraints without significantly sacrificing solution quality.