Inference on Optimal Dynamic Policies via Softmax Approximation - 专知 (Zhuanzhi) Papers


Optimizer · Inference · Softmax · Approximation · Estimation/Estimator

March 8, 2023

Inference on Optimal Dynamic Policies via Softmax Approximation


Qizhao Chen, Morgane Austern, Vasilis Syrgkanis

Estimating optimal dynamic policies from offline data is a fundamental problem in dynamic decision making. In the context of causal inference, the problem is known as estimating the optimal dynamic treatment regime. Even though there exists a plethora of methods for estimation, constructing confidence intervals for the value of the optimal regime and the structural parameters associated with it is inherently harder, as it involves non-linear and non-differentiable functionals of unknown quantities that need to be estimated. Prior work resorted to sub-sample approaches that can deteriorate the quality of the estimate. We show that a simple softmax approximation to the optimal treatment regime, for an appropriately fast-growing temperature parameter, can achieve valid inference on the truly optimal regime. We illustrate our result for a two-period optimal dynamic regime, though our approach should directly extend to the finite horizon case. Our work combines techniques from semi-parametric inference and $g$-estimation, together with an appropriate triangular array central limit theorem, as well as a novel analysis of the asymptotic influence and asymptotic bias of softmax approximations.
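The core idea of replacing a non-differentiable maximum with a softmax-weighted average can be illustrated with a small numerical sketch. This is not the authors' estimator: the vector `q` of per-treatment values, the scaling parameter `beta` (standing in for the growing temperature parameter the abstract mentions), and both function names are assumptions made purely for illustration.

```python
import numpy as np

def softmax_weights(values, beta):
    """Softmax weights over candidate treatments (hypothetical helper).

    Subtracting the maximum before exponentiating keeps the computation
    numerically stable for large beta.
    """
    z = beta * (values - values.max())
    w = np.exp(z)
    return w / w.sum()

def softmax_value(values, beta):
    """Smooth surrogate for max(values): a softmax-weighted average."""
    return float(np.dot(softmax_weights(values, beta), values))

# Hypothetical values of three candidate treatments (assumed numbers).
q = np.array([1.0, 1.2, 0.7])

# Gap between the true maximum and its smooth surrogate as beta grows.
gaps = [q.max() - softmax_value(q, beta) for beta in (1.0, 10.0, 100.0)]
print(gaps)
```

In this toy setting the gap shrinks as `beta` increases, which is the approximation bias that has to vanish fast enough for inference on the truly optimal regime. How fast the parameter must grow with the sample size is the subject of the paper's analysis and is not reproduced here.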

