Recent advances in mobile health (mHealth) technology provide an effective way to monitor individuals' health statuses and deliver just-in-time personalized interventions. However, the practical use of mHealth technology raises unique challenges for existing methodologies for learning an optimal dynamic treatment regime. Many mHealth applications involve decision making among a large number of intervention options under an infinite-horizon setting in which the number of decision stages diverges to infinity. In addition, temporary medication shortages may cause optimal treatments to be unavailable, and it is unclear which alternatives should be used. To address these challenges, we propose a Proximal Temporal consistency Learning (pT-Learning) framework to estimate an optimal regime that is adaptively adjusted between deterministic and stochastic sparse policy models. The resulting minimax estimator avoids the double-sampling issue present in existing algorithms. It can be further simplified and can easily incorporate off-policy data without requiring mismatched distribution corrections. We study theoretical properties of the sparse policy and establish finite-sample bounds on the excess risk and performance error. The proposed method is implemented in our proximalDTR package and is evaluated through extensive simulation studies and the OhioT1DM mHealth dataset.
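To make the sparse-policy idea concrete, the minimal sketch below shows a sparsemax-style Euclidean projection of action scores onto the probability simplex, one standard way to obtain a policy that places exactly zero mass on weak actions and so interpolates between a deterministic argmax rule and a stochastic one. This is an illustrative assumption, not the implementation in the proximalDTR package; the function name and interface are hypothetical.

```python
import numpy as np

def sparsemax(z):
    """Illustrative sparsemax projection (Martins & Astudillo, 2016).

    Projects a score vector z onto the probability simplex. Unlike
    softmax, the output can be exactly zero for low-scoring actions,
    so the induced policy is deterministic when one action dominates
    and stochastic when several actions have comparable scores.
    NOTE: a hypothetical sketch of a sparse policy map, not the
    pT-Learning estimator itself.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]            # scores in decreasing order
    cssv = np.cumsum(z_sorted)             # cumulative sums of sorted scores
    k = np.arange(1, len(z) + 1)
    support = z_sorted - (cssv - 1.0) / k > 0   # actions kept in the support
    k_star = k[support][-1]                # size of the support
    tau = (cssv[support][-1] - 1.0) / k_star    # soft threshold
    return np.maximum(z - tau, 0.0)

# One dominant action -> deterministic policy: [1., 0., 0.]
print(sparsemax([5.0, 0.0, 0.0]))
# Two comparable actions -> stochastic over those two, zero elsewhere
print(sparsemax([2.0, 1.9, -1.0]))
```

With many intervention options, this kind of projection keeps the policy supported on a small subset of near-optimal actions, which also suggests natural alternatives when the top-ranked treatment is temporarily unavailable.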