Building good systems in the face of complex societal effects requires a dynamic approach to equity and access. Recent approaches to machine learning (ML) documentation have demonstrated the promise of discursive frameworks for deliberation about these complexities. However, these developments have been grounded in a static ML paradigm, leaving the role of feedback and post-deployment performance unexamined. Meanwhile, recent work in reinforcement learning design has shown that the effects of optimization objectives on the resultant system behavior can be wide-ranging and unpredictable. In this paper, we sketch a framework for documenting deployed learning systems, which we call Reward Reports. Taking inspiration from various contributions to the technical literature on reinforcement learning, we outline Reward Reports as living documents that track updates to the design choices and assumptions behind what a particular automated system is optimizing for. They are intended to track dynamic phenomena arising from system deployment, rather than merely static properties of models or data. After presenting the elements of a Reward Report, we provide three examples: DeepMind's MuZero, MovieLens, and a hypothetical deployment of a Project Flow traffic control policy.
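To make the idea of a "living document" concrete, the sketch below models a Reward Report as a structured record with an appendable changelog. This is a minimal illustration only: the field names (`optimization_intent`, `reward_specification`, `known_assumptions`) and the example entries are hypothetical paraphrases of the kinds of design choices and assumptions the abstract says a report should track, not a canonical template from the paper.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ChangelogEntry:
    """One dated update to the report: what changed and why."""
    when: date
    change: str      # e.g. a reweighted reward term
    rationale: str   # the observed deployment behavior motivating the change


@dataclass
class RewardReport:
    """Illustrative container for a living Reward Report.

    Field names are hypothetical; they stand in for the design choices
    and assumptions a report would track over a system's deployment.
    """
    system_name: str
    optimization_intent: str     # what the system is meant to optimize for
    reward_specification: str    # the objective actually implemented
    known_assumptions: list[str]
    changelog: list[ChangelogEntry] = field(default_factory=list)

    def record_update(self, when: date, change: str, rationale: str) -> None:
        """Append a dated revision, keeping the document 'living'."""
        self.changelog.append(ChangelogEntry(when, change, rationale))


# Usage: a report for a hypothetical traffic-control deployment.
report = RewardReport(
    system_name="Project Flow signal-control policy (hypothetical)",
    optimization_intent="Reduce average commuter travel time network-wide",
    reward_specification="Negative of summed vehicle wait time per cycle",
    known_assumptions=["Sensor coverage is uniform across intersections"],
)
report.record_update(
    date(2022, 6, 1),
    "Added a per-intersection fairness penalty to the reward",
    "Post-deployment data showed wait times concentrating on side streets",
)
```

The design choice worth noting is that updates are append-only: rather than overwriting the objective's description, each revision is kept alongside its rationale, so the report records the system's optimization history rather than only its current state.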