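The EM algorithm, short for Expectation-Maximization algorithm, is an iterative algorithm for maximum likelihood estimation or maximum a posteriori (MAP) estimation of the parameters of probabilistic models that contain latent variables.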
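To make the iteration concrete, here is a minimal sketch of EM for maximum likelihood estimation in a two-component one-dimensional Gaussian mixture, where the latent variable is the unobserved component that generated each point. The function names `normal_pdf` and `em_gmm_1d`, the initialisation (smallest and largest data points as starting means), the variance floor, and the synthetic data are all assumptions made for illustration; none of them come from the text above.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, n_iter=100):
    """Illustrative EM for a two-component 1D Gaussian mixture."""
    n = len(data)
    # Assumed initialisation: extreme points as means, overall variance for both.
    means = [min(data), max(data)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    log_liks = []

    for _ in range(n_iter):
        # E-step: posterior probability (responsibility) of each component
        # for each point, given the current parameters.
        resp = []
        log_lik = 0.0
        for x in data:
            joint = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = joint[0] + joint[1]
            log_lik += math.log(total)
            resp.append([j / total for j in joint])
        log_liks.append(log_lik)

        # M-step: re-estimate parameters from responsibility-weighted data.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # floor guards against collapse

    return weights, means, variances, log_liks


# Synthetic data (hypothetical): two clusters centred at -2 and 3.
random.seed(0)
data = [random.gauss(-2.0, 1.0) for _ in range(200)]
data += [random.gauss(3.0, 1.0) for _ in range(200)]

weights, means, variances, log_liks = em_gmm_1d(data)
print("weights:", weights)
print("means:", means)
print("variances:", variances)
```

The list `log_liks` records the observed-data log-likelihood at the start of each iteration. As long as the variance floor is not triggered, it should never decrease, which is the property that makes EM a practical tool for maximum likelihood estimation with latent variables.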

VIP content

Title

VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning

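Keywords

Meta-learning, variational inference, Bayesian inference, expectation maximization, reinforcement learning, deep learning, artificial intelligence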

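Abstract

Trading off exploration and exploitation in an unknown environment is key to maximizing expected return during learning. A Bayes-optimal policy, which makes this trade-off optimally, conditions its actions not only on the environment state but also on the agent's uncertainty about the environment. Computing a Bayes-optimal policy, however, is intractable for all but the smallest tasks. In this paper, we introduce variational Bayes-Adaptive Deep RL (variBAD), a way to meta-learn to perform approximate inference in an unknown environment and to incorporate task uncertainty directly during action selection. In a grid-world domain, we illustrate how variBAD performs structured online exploration as a function of task uncertainty. We also evaluate variBAD on MuJoCo domains widely used in meta-RL and show that it achieves higher return during training than existing methods.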

Authors

Luisa Zintgraf, Kyriacos Shiarlis, Maximilian Igl, Sebastian Schulze, Yarin Gal, Katja Hofmann, Shimon Whiteson


Latest papers

Survival analysis is a challenging variation of regression modeling because of the presence of censoring, where the outcome measurement is only partially known, due to, for example, loss to follow up. Such problems come up frequently in medical applications, making survival analysis a key endeavor in biostatistics and machine learning for healthcare, with Cox regression models being amongst the most commonly employed models. We describe a new approach for survival analysis regression models, based on learning mixtures of Cox regressions to model individual survival distributions. We propose an approximation to the Expectation Maximization algorithm for this model that does hard assignments to mixture groups to make optimization efficient. In each group assignment, we fit the hazard ratios within each group using deep neural networks, and the baseline hazard for each mixture component non-parametrically. We perform experiments on multiple real world datasets, and look at the mortality rates of patients across ethnicity and gender. We emphasize the importance of calibration in healthcare settings and demonstrate that our approach outperforms classical and modern survival analysis baselines, both in terms of discriminative performance and calibration, with large gains in performance on the minority demographics.
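The idea of replacing soft responsibilities with hard assignments can be sketched on a much simpler model. The example below is not the method from the abstract above: it swaps the mixture of Cox regressions (with neural-network hazard ratios and non-parametric baseline hazards) for a plain one-dimensional Gaussian mixture, purely to show the alternation between assigning each point to its most likely group and refitting each group on its own members. The names `log_normal_pdf` and `hard_em_1d`, the initialisation, and the synthetic data are hypothetical.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Log-density of a univariate normal distribution at x."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def hard_em_1d(points, n_components=2, n_iter=50):
    """Illustrative hard-assignment EM for a 1D Gaussian mixture."""
    n = len(points)
    lo, hi = min(points), max(points)
    # Assumed initialisation: means spread evenly over the data range.
    means = [lo + (hi - lo) * (k + 0.5) / n_components for k in range(n_components)]
    centre = sum(points) / n
    spread = sum((x - centre) ** 2 for x in points) / n
    variances = [spread] * n_components
    weights = [1.0 / n_components] * n_components
    labels = None

    for _ in range(n_iter):
        # Hard E-step: assign each point to its single most likely component.
        new_labels = []
        for x in points:
            scores = [
                math.log(weights[k]) + log_normal_pdf(x, means[k], variances[k])
                for k in range(n_components)
            ]
            new_labels.append(scores.index(max(scores)))
        if new_labels == labels:
            break  # assignments are stable, so the fit will not change
        labels = new_labels

        # M-step: refit each component using only the points assigned to it.
        for k in range(n_components):
            members = [x for x, z in zip(points, labels) if z == k]
            if len(members) < 2:
                continue  # keep old parameters for (near-)empty groups
            weights[k] = len(members) / n
            means[k] = sum(members) / len(members)
            var_k = sum((x - means[k]) ** 2 for x in members) / len(members)
            variances[k] = max(var_k, 1e-6)

    return labels, weights, means, variances


# Synthetic data (hypothetical): two well-separated groups.
random.seed(1)
points = [random.gauss(0.0, 1.0) for _ in range(100)]
points += [random.gauss(10.0, 1.0) for _ in range(100)]

labels, mix_weights, group_means, group_vars = hard_em_1d(points)
print("group means:", group_means)
```

Because each point contributes to exactly one group, the M-step reduces to fitting an independent model per group, which is what makes hard assignments attractive when each per-group fit is expensive.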
