Drug dosing is an important application of AI that can be formulated as a Reinforcement Learning (RL) problem. In this paper, we identify two major challenges of using RL for drug dosing: the delayed and prolonged effects of administering medications, which break the Markov assumption of the RL framework. We focus on prolongedness and define the PAE-POMDP (Prolonged Action Effect-Partially Observable Markov Decision Process), a subclass of POMDPs in which the Markov assumption does not hold specifically because of the prolonged effects of actions. Motivated by the pharmacology literature, we propose a simple and effective approach for converting drug dosing PAE-POMDPs into MDPs, enabling the use of existing RL algorithms to solve such problems. We validate the proposed approach on a toy task and on a challenging glucose control task, for which we devise a clinically inspired reward function. Our results demonstrate that: (1) the proposed method for restoring the Markov assumption leads to significant improvements over a vanilla baseline; (2) the approach is competitive with recurrent policies, which may inherently capture the prolonged effects of actions; (3) it is considerably more time- and memory-efficient than the recurrent baseline and hence more suitable for real-time dosing control systems; and (4) it exhibits favorable qualitative behavior in our policy analysis.
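The abstract does not spell out how the PAE-POMDP-to-MDP conversion works; the sketch below is only an illustration of one plausible reading, in which an exponentially decaying "effective dose" (mirroring first-order drug elimination in pharmacokinetics) is tracked and appended to the observation so that the augmented state is approximately Markov. The wrapper interface, the `decay` parameter, and the max-based update rule are all assumptions for the sake of the example, not the paper's confirmed method.

```python
import numpy as np


class EffectiveDoseWrapper:
    """Hypothetical wrapper that turns a dosing PAE-POMDP into an MDP-style
    environment by augmenting observations with a decaying effective dose.

    Assumes `env` exposes reset() -> obs and step(action) -> (obs, reward, done),
    where `action` is a scalar dose.
    """

    def __init__(self, env, decay=0.9):
        self.env = env
        self.decay = decay          # per-step retention of the lingering dose effect
        self.effective_dose = 0.0

    def reset(self):
        self.effective_dose = 0.0
        obs = self.env.reset()
        return self._augment(obs)

    def step(self, action):
        # Assumed update: the lingering effect is the decayed previous effect
        # or the newly administered dose, whichever dominates.
        self.effective_dose = max(float(action), self.decay * self.effective_dose)
        obs, reward, done = self.env.step(action)
        return self._augment(obs), reward, done

    def _augment(self, obs):
        # Append the effective dose so the agent's state carries the
        # prolonged-action information that the raw observation lacks.
        return np.append(np.asarray(obs, dtype=np.float32), self.effective_dose)
```

Under this reading, any off-the-shelf RL algorithm can then be trained on the augmented state exactly as it would on an ordinary MDP, without recurrent memory.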