CARMS: 分类-反异电-REINFORCE 多模型梯度梯度模拟器 (CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator) - 专知论文

会员服务 ·

0

估计/估计量 · 无偏 · 负相关法 · 可约的 · 无偏估计 ·

2021 年 10 月 26 日

CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator

翻译：CARMS: 分类-反异电-REINFORCE 多模型梯度梯度模拟器

Alek Dimitriev,Mingyuan Zhou

Accurately backpropagating the gradient through categorical variables is a challenging task that arises in various domains, such as training discrete latent variable models. To this end, we propose CARMS, an unbiased estimator for categorical random variables based on multiple mutually negatively correlated (jointly antithetic) samples. CARMS combines REINFORCE with copula based sampling to avoid duplicate samples and reduce its variance, while keeping the estimator unbiased using importance sampling. It generalizes both the ARMS antithetic estimator for binary variables, which is CARMS for two categories, as well as LOORF/VarGrad, the leave-one-out REINFORCE estimator, which is CARMS with independent samples. We evaluate CARMS on several benchmark datasets on a generative modeling task, as well as a structured output prediction task, and find it to outperform competing methods including a strong self-control baseline. The code is publicly available.

翻译：通过绝对变量准确反向地对梯度进行直截了当的剖析是一项挑战性任务,它出现在不同领域,例如培训离散潜伏变量模型。为此,我们建议CARMS(CARMS),这是基于多个相互负相关(联合抗药性)样本的绝对随机变量的无偏向的估算器。CARIM(CARIM)将REINFORCE与基于相交样本的样本结合,以避免重复样本并减少其差异,同时使用重要取样使估计器保持公正性。它概括了二元变量的AARMS抗遗传测量器,即两个类别的CARMS(CARMS),以及LORF/VarGrad(使用独立样本的LOORF/VarGrad) REINFORCE测算器。我们评估CARIM(CARIM)的数个基准数据集,以基因化模型任务为基础,以及结构化输出预测任务,并发现它超越了竞争方法,包括强大的自控基线。代码是公开提供的。

0

相关内容

估计/估计量

估计/估计量

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

54+阅读 · 2020年9月7日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

112+阅读 · 2020年5月15日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

252+阅读 · 2020年4月19日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【斯坦福大学】Gradient Surgery for Multi-Task Learning

【斯坦福大学】Gradient Surgery for Multi-Task Learning

专知会员服务

47+阅读 · 2020年1月23日

【CGAN论文笔记强烈推荐】基于CGAN的人脸深度图估计： Face Depth Estimation With Conditional Generative Adversarial Networks

专知会员服务

24+阅读 · 2020年1月8日

强化学习最优表示的几何视角（A Geometric Perspective on Optimal Representations for Reinforcement Learning）

强化学习最优表示的几何视角（A Geometric Perspective on Optimal Representations for Reinforcement Learning）

专知会员服务

9+阅读 · 2019年12月24日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

TensorFlow 2.0 学习资源汇总

TensorFlow 2.0 学习资源汇总

专知会员服务

67+阅读 · 2019年10月9日

分布式并行架构Ray介绍

分布式并行架构Ray介绍

CreateAMind

10+阅读 · 2019年8月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

春节充电系列：李宏毅2017机器学习课程学习笔记24之结构化学习-Structured SVM（part 2）

春节充电系列：李宏毅2017机器学习课程学习笔记24之结构化学习-Structured SVM（part 2）

专知

4+阅读 · 2018年3月10日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

【推荐】TensorFlow手把手CNN实践指南

【推荐】TensorFlow手把手CNN实践指南

机器学习研究会

5+阅读 · 2017年8月17日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

Estimation based on nearest neighbor matching: from density ratio to average treatment effect

Arxiv

0+阅读 · 2021年12月27日

Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model

Arxiv

0+阅读 · 2021年12月23日

HuMoR: 3D Human Motion Model for Robust Pose Estimation

Arxiv

3+阅读 · 2021年5月10日

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

Arxiv

8+阅读 · 2021年4月22日

Deep Structured Prediction with Nonlinear Output Transformations

Arxiv

4+阅读 · 2018年11月1日

Coarse-to-fine Seam Estimation for Image Stitching

Arxiv

4+阅读 · 2018年5月24日

Feasibility Based Large Margin Nearest Neighbor Metric Learning

Arxiv

3+阅读 · 2018年5月2日

Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate

Arxiv

7+阅读 · 2018年4月24日

Fine-Grained Head Pose Estimation Without Keypoints

Arxiv

5+阅读 · 2018年4月13日

Positive-Unlabeled Learning with Non-Negative Risk Estimator

Arxiv

3+阅读 · 2017年11月4日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

54+阅读 · 2020年9月7日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

112+阅读 · 2020年5月15日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

252+阅读 · 2020年4月19日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【斯坦福大学】Gradient Surgery for Multi-Task Learning

【斯坦福大学】Gradient Surgery for Multi-Task Learning

专知会员服务

47+阅读 · 2020年1月23日

【CGAN论文笔记强烈推荐】基于CGAN的人脸深度图估计： Face Depth Estimation With Conditional Generative Adversarial Networks

专知会员服务

24+阅读 · 2020年1月8日

强化学习最优表示的几何视角（A Geometric Perspective on Optimal Representations for Reinforcement Learning）

强化学习最优表示的几何视角（A Geometric Perspective on Optimal Representations for Reinforcement Learning）

专知会员服务

9+阅读 · 2019年12月24日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

TensorFlow 2.0 学习资源汇总

TensorFlow 2.0 学习资源汇总

专知会员服务

67+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《利用模型系统工程软件测试技术增强作战飞行试验研究》最新213页

【CMU博士论文】在学习与推理中融入搜索

《运用深度学习技术检测卫星异常活动以预警军事行动迹象》

《无人机通信作战效能评估：军用环境下保密无人机通信平台的仿真验证》

相关资讯

分布式并行架构Ray介绍

分布式并行架构Ray介绍

CreateAMind

10+阅读 · 2019年8月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

春节充电系列：李宏毅2017机器学习课程学习笔记24之结构化学习-Structured SVM（part 2）

春节充电系列：李宏毅2017机器学习课程学习笔记24之结构化学习-Structured SVM（part 2）

专知

4+阅读 · 2018年3月10日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

【推荐】TensorFlow手把手CNN实践指南

【推荐】TensorFlow手把手CNN实践指南

机器学习研究会

5+阅读 · 2017年8月17日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

Estimation based on nearest neighbor matching: from density ratio to average treatment effect

Arxiv

0+阅读 · 2021年12月27日

Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model

Arxiv

0+阅读 · 2021年12月23日

HuMoR: 3D Human Motion Model for Robust Pose Estimation

Arxiv

3+阅读 · 2021年5月10日

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

Arxiv

8+阅读 · 2021年4月22日

Deep Structured Prediction with Nonlinear Output Transformations

Arxiv

4+阅读 · 2018年11月1日

Coarse-to-fine Seam Estimation for Image Stitching

Arxiv

4+阅读 · 2018年5月24日

Feasibility Based Large Margin Nearest Neighbor Metric Learning

Arxiv

3+阅读 · 2018年5月2日

Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate

Arxiv

7+阅读 · 2018年4月24日

Fine-Grained Head Pose Estimation Without Keypoints

Arxiv

5+阅读 · 2018年4月13日

Positive-Unlabeled Learning with Non-Negative Risk Estimator

Arxiv

3+阅读 · 2017年11月4日

微信扫码咨询专知VIP会员