模型应该准确吗? (Should Models Be Accurate?) - 专知论文

会员服务 ·

0

回合 · MoDELS · 学成 · 知识 (knowledge) · 学习器 ·

2022 年 5 月 22 日

Should Models Be Accurate?

翻译：模型应该准确吗?

Esra'a Saleh,John D. Martin,Anna Koop,Arash Pourzarabi,Michael Bowling

from arxiv, The 5th Multidisciplinary Conference on Reinforcement Learning and Decision Making ( RLDM 2022 )

Model-based Reinforcement Learning (MBRL) holds promise for data-efficiency by planning with model-generated experience in addition to learning with experience from the environment. However, in complex or changing environments, models in MBRL will inevitably be imperfect, and their detrimental effects on learning can be difficult to mitigate. In this work, we question whether the objective of these models should be the accurate simulation of environment dynamics at all. We focus our investigations on Dyna-style planning in a prediction setting. First, we highlight and support three motivating points: a perfectly accurate model of environment dynamics is not practically achievable, is not necessary, and is not always the most useful anyways. Second, we introduce a meta-learning algorithm for training models with a focus on their usefulness to the learner instead of their accuracy in modelling the environment. Our experiments show that in a simple non-stationary environment, our algorithm enables faster learning than even using an accurate model built with domain-specific knowledge of the non-stationarity.

翻译：以模型为基础的强化学习(MBRL)通过利用模型生成的经验进行规划,除了从环境中学习经验外,还利用模型生成的经验来提高数据效率。然而,在复杂或变化的环境中,模型的模型必然不完善,对学习的有害影响难以减轻。在这项工作中,我们质疑这些模型的目标是否应当是对环境动态的准确模拟。我们在预测环境中把调查的重点放在Dyna风格的规划上。首先,我们强调并支持三个激励点:一个完全准确的环境动态模型实际上无法实现,并不必要,而且总是最有用的。第二,我们为培训模型采用元学习算法,重点是这些模型对学习者有用,而不是在模拟环境中的准确性。我们的实验表明,在简单的非静止环境中,我们的算法能够更快地学习,而不是使用一个精确的模型,该模型以特定领域对非静止性的知识为基础。

0

相关内容

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Periostin-avβ3-FAK-PI3K通路在褐藻糖胶抗乳腺癌转移中的作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

肿瘤抗原HCA587与STAT3的相互作用及其促进肿瘤转移的分子机制研究

国家自然科学基金

1+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

人滋养层细胞表面抗原-2调控胆囊癌细胞增殖、侵袭和转移的作用及其机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

分离、鉴定肺癌细胞中与核糖体蛋白L22相互作用的蛋白分子

国家自然科学基金

0+阅读 · 2009年12月31日

狼毒活性成分诱导乳腺癌细胞凋亡的作用及其对PI3K/Akt信号通路的调控

国家自然科学基金

0+阅读 · 2009年12月31日

砷致激素非依赖型乳腺癌细胞雌激素受体α20877;表达的机制

国家自然科学基金

0+阅读 · 2009年12月31日

富含半胱氨酸的酸性分泌蛋白SPARC在胃癌细胞中的表达和调控

国家自然科学基金

0+阅读 · 2009年12月31日

靶向乳腺癌干细胞导入mir-30a抑制乳腺癌转移的研究

国家自然科学基金

0+阅读 · 2009年12月31日

Legumain在乳腺癌骨转移和破骨损伤过程中的作用机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

Arena-Bench: A Benchmarking Suite for Obstacle Avoidance Approaches in Highly Dynamic Environments

Arxiv

0+阅读 · 2022年7月11日

GAN Cocktail: mixing GANs without dataset access

Arxiv

0+阅读 · 2022年7月11日

LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action

Arxiv

0+阅读 · 2022年7月10日

Shall we count the living or the dead?

Arxiv

0+阅读 · 2022年7月9日

Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning

Arxiv

0+阅读 · 2022年7月8日

Should All Proposals be Treated Equally in Object Detection?

Arxiv

0+阅读 · 2022年7月7日

A Case-Study on Variations Observed in Accelerometers Across Devices

A Case-Study on Variations Observed in Accelerometers Across Devices

Arxiv

0+阅读 · 2022年7月7日

Deep Rotation Correction without Angle Prior

Arxiv

0+阅读 · 2022年7月7日

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Arxiv

18+阅读 · 2019年9月25日

Generating Diverse and Accurate Visual Captions by Comparative Adversarial Learning

Arxiv

10+阅读 · 2018年4月11日

VIP会员

文章信息

相关主题

知识 (knowledge)

相关VIP内容

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《人工智能绝不能完全自主》

《人工智能的法律与伦理：军事自主机器独特挑战的深度剖析》316页

从数据到主导：AI与兵棋推演构筑决策优势

《特洛伊木马货柜：武器化集装箱的战略威胁》最新报告

相关资讯

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Arena-Bench: A Benchmarking Suite for Obstacle Avoidance Approaches in Highly Dynamic Environments

Arxiv

0+阅读 · 2022年7月11日

GAN Cocktail: mixing GANs without dataset access

Arxiv

0+阅读 · 2022年7月11日

LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action

Arxiv

0+阅读 · 2022年7月10日

Shall we count the living or the dead?

Arxiv

0+阅读 · 2022年7月9日

Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning

Arxiv

0+阅读 · 2022年7月8日

Should All Proposals be Treated Equally in Object Detection?

Arxiv

0+阅读 · 2022年7月7日

A Case-Study on Variations Observed in Accelerometers Across Devices

A Case-Study on Variations Observed in Accelerometers Across Devices

Arxiv

0+阅读 · 2022年7月7日

Deep Rotation Correction without Angle Prior

Arxiv

0+阅读 · 2022年7月7日

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Arxiv

18+阅读 · 2019年9月25日

Generating Diverse and Accurate Visual Captions by Comparative Adversarial Learning

Arxiv

10+阅读 · 2018年4月11日

相关基金

Periostin-avβ3-FAK-PI3K通路在褐藻糖胶抗乳腺癌转移中的作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

肿瘤抗原HCA587与STAT3的相互作用及其促进肿瘤转移的分子机制研究

国家自然科学基金

1+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

人滋养层细胞表面抗原-2调控胆囊癌细胞增殖、侵袭和转移的作用及其机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

分离、鉴定肺癌细胞中与核糖体蛋白L22相互作用的蛋白分子

国家自然科学基金

0+阅读 · 2009年12月31日

狼毒活性成分诱导乳腺癌细胞凋亡的作用及其对PI3K/Akt信号通路的调控

国家自然科学基金

0+阅读 · 2009年12月31日

砷致激素非依赖型乳腺癌细胞雌激素受体α20877;表达的机制

国家自然科学基金

0+阅读 · 2009年12月31日

富含半胱氨酸的酸性分泌蛋白SPARC在胃癌细胞中的表达和调控

国家自然科学基金

0+阅读 · 2009年12月31日

靶向乳腺癌干细胞导入mir-30a抑制乳腺癌转移的研究

国家自然科学基金

0+阅读 · 2009年12月31日

Legumain在乳腺癌骨转移和破骨损伤过程中的作用机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员