Accounting for Human Learning when Inferring Human Preferences - Zhuanzhi (专知) Papers


Inference · Learning · Episode · Stationary · Inverse reinforcement learning

December 1, 2020

Accounting for Human Learning when Inferring Human Preferences


Harry Giles, Lawrence Chan

From arXiv; accepted to the 2020 NeurIPS HAMLETS workshop.

Inverse reinforcement learning (IRL) is a common technique for inferring human preferences from data. Standard IRL techniques tend to assume that the human demonstrator is stationary, that is, that their policy $\pi$ does not change over time. In practice, humans interacting with a novel environment or performing a novel task will change their demonstrations as they learn more about the environment or task. We investigate the consequences of relaxing this stationarity assumption, in particular by modelling the human as learning. Surprisingly, we find in some small examples that this can lead to better inference than if the human were stationary. That is, by observing a demonstrator who is themselves learning, a machine can infer more than by observing a demonstrator who is noisily rational. In addition, we find evidence that misspecification can lead to poor inference, suggesting that modelling human learning is important, especially when the human is facing an unfamiliar environment.
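The contrast between a stationary and a learning demonstrator can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the authors' method or experimental setup: it assumes a two-armed bandit with deterministic rewards, a Boltzmann (noisily rational) human with a hypothetical rationality parameter `beta`, a learner whose value estimates start at zero, and a uniform prior over two candidate reward vectors. All function names are illustrative.

```python
import math

def softmax(values, beta):
    """Boltzmann choice probabilities over arms (numerically stable)."""
    m = max(values)
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def log_lik_stationary(actions, theta, beta=3.0):
    """Stationary model: the human always acts on the true rewards theta."""
    probs = softmax(theta, beta)
    return sum(math.log(probs[a]) for a in actions)

def log_lik_learning(actions, theta, beta=3.0):
    """Learning model: the human acts on estimates that update after each pull.

    Assumption: rewards are deterministic, so one pull of an arm reveals
    its true value; unpulled arms keep the initial estimate of 0.
    """
    estimates = [0.0] * len(theta)
    total = 0.0
    for a in actions:
        probs = softmax(estimates, beta)
        total += math.log(probs[a])
        estimates[a] = theta[a]  # the human observes the reward of arm a
    return total

def posterior(actions, hypotheses, log_lik, beta=3.0):
    """Posterior over reward hypotheses under a uniform prior."""
    logs = [log_lik(actions, theta, beta) for theta in hypotheses]
    m = max(logs)
    weights = [math.exp(l - m) for l in logs]
    total = sum(weights)
    return [w / total for w in weights]

# Two candidate reward vectors: arm 0 is good, or arm 1 is good.
hypotheses = [[1.0, 0.0], [0.0, 1.0]]
# Observed actions: the human tries arm 0 once, then settles on arm 1.
actions = [0, 1, 1, 1, 1]

post_learning = posterior(actions, hypotheses, log_lik_learning)
post_stationary = posterior(actions, hypotheses, log_lik_stationary)
print("learning model:  ", post_learning)
print("stationary model:", post_stationary)
```

Both models favour the second hypothesis on this action sequence, but they explain the early exploratory pull differently: the learning model treats it as a choice made before the human knew the arm values, whereas the stationary model must treat it as noise. This toy setting is too small to reproduce the paper's findings and is meant only to show where the two likelihoods differ.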

