We consider off-policy evaluation (OPE) in continuous treatment settings, such as personalized dose-finding. In OPE, one aims to estimate the mean outcome under a new treatment decision rule using historical data generated by a different decision rule. Most existing works on OPE focus on discrete treatment settings. To handle continuous treatments, we develop a novel estimation method for OPE using deep jump learning. The key ingredient of our method lies in adaptively discretizing the treatment space using deep discretization, which leverages deep learning and multi-scale change point detection. This allows us to apply existing OPE methods for discrete treatments to handle continuous treatments. Our method is further justified by theoretical results, simulations, and a real application to warfarin dosing.
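Purely as an illustration of the recipe summarized above, and not the authors' estimator, the sketch below discretizes a one-dimensional treatment space with a simple penalized change-point segmentation and plugs the resulting intervals into a direct-method OPE estimate. The function names (`segment_treatment_space`, `direct_method_value`), the grid resolution and penalty, the per-interval sample means (standing in for fitted outcome models), and the constant target policy in the demo are all illustrative assumptions.

```python
# Illustrative sketch only (not the paper's implementation): discretize a
# one-dimensional treatment space via penalized change-point segmentation,
# then plug the intervals into a simple direct-method OPE estimate.
import numpy as np

def segment_treatment_space(a, y, n_grid=20, penalty=0.5):
    """Partition [0, 1] into intervals by dynamic programming.
    The within-interval cost is the residual sum of squares around the
    interval mean, a crude stand-in for the per-interval outcome models
    one would fit in practice."""
    grid = np.linspace(0.0, 1.0, n_grid + 1)

    def cost(lo, hi):
        mask = (a >= grid[lo]) & (a < grid[hi]) if hi < n_grid else (a >= grid[lo])
        return float(np.sum((y[mask] - y[mask].mean()) ** 2)) if mask.any() else 0.0

    dp = np.full(n_grid + 1, np.inf)   # dp[j]: best penalized cost of segmenting grid[0:j]
    back = np.zeros(n_grid + 1, dtype=int)
    dp[0] = 0.0
    for j in range(1, n_grid + 1):
        for i in range(j):
            c = dp[i] + cost(i, j) + penalty
            if c < dp[j]:
                dp[j], back[j] = c, i
    cuts, j = [], n_grid
    while j > 0:                        # recover interval boundaries
        cuts.append((grid[back[j]], grid[j]))
        j = back[j]
    return cuts[::-1]

def direct_method_value(x, a, y, pi, intervals):
    """Direct-method OPE: estimate the mean outcome within each interval
    (here a sample mean instead of a fitted regression) and average over
    the interval that the target policy's recommended dose falls into."""
    means = []
    for lo, hi in intervals:
        mask = (a >= lo) & (a < hi) if hi < 1.0 else (a >= lo)
        means.append(y[mask].mean() if mask.any() else y.mean())
    values = []
    for xi in x:
        dose = pi(xi)                   # dose recommended by the target policy
        k = len(intervals) - 1
        for i, (lo, hi) in enumerate(intervals):
            if lo <= dose < hi:
                k = i
                break
        values.append(means[k])
    return float(np.mean(values))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 2000
    x = rng.uniform(size=n)             # covariate (unused by the toy outcome model)
    a = rng.uniform(size=n)             # behaviour policy: doses drawn uniformly on [0, 1]
    y = np.where(a < 0.5, 1.0, 2.0) + 0.1 * rng.standard_normal(n)  # jump at dose 0.5
    intervals = segment_treatment_space(a, y)
    value = direct_method_value(x, a, y, pi=lambda xi: 0.8, intervals=intervals)
    print(intervals)                    # should recover a cut near dose 0.5
    print(value)                        # should be close to 2.0
```

In this toy setting the segmentation recovers the jump in the outcome surface at dose 0.5, and the direct-method estimate of a policy that always recommends dose 0.8 is close to the true value of 2.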