Contextual bandits often provide simple and effective personalization in decision-making problems, making them popular tools for delivering personalized interventions in mobile health and other health applications. However, when bandits are deployed in the context of a scientific study (e.g., a clinical trial to test whether a mobile health intervention is effective), the aim is not only to personalize for an individual but also to determine, with sufficient statistical power, whether the system's intervention is effective. Assessing the effectiveness of the intervention before broader deployment is essential for better resource allocation. The two objectives are often pursued under different model assumptions, making it hard to determine how achieving personalization and achieving statistical power affect each other. In this work, we develop general meta-algorithms that modify existing algorithms so that sufficient power is guaranteed while each user's well-being is still improved. We also demonstrate that our meta-algorithms are robust to various model misspecifications that may arise in statistical studies, thus providing a valuable tool for study designers.