Experience Filter: Using Past Experiences on Unseen Tasks or Environments

Episode · Correlation coefficient · Learning · Agent · Partially observable Markov decision process

May 29, 2023

Anil Yildiz, Esen Yel, Anthony L. Corso, Kyle H. Wray, Stefan J. Witwicki, Mykel J. Kochenderfer

From arXiv. Accepted at the IEEE Intelligent Vehicles Symposium (IV) 2023.

One of the bottlenecks of training autonomous vehicle (AV) agents is the variability of training environments. Since learning optimal policies for unseen environments is often very costly and requires substantial data collection, it becomes computationally intractable to train the agent on every possible environment or task the AV may encounter. This paper introduces a zero-shot filtering approach that interpolates policies learned from past experiences to generalize to unseen ones. We use an experience kernel to correlate environments. These correlations are then exploited to produce policies for new tasks or environments from learned policies. We demonstrate our methods on an autonomous vehicle driving through T-intersections with different characteristics, where its behavior is modeled as a partially observable Markov decision process (POMDP). We first construct compact representations of learned policies for POMDPs with unknown transition functions, given a dataset of sequential actions and observations. Then, we filter parameterized policies of previously visited environments to generate policies for new, unseen environments. We demonstrate our approaches on both an actual AV and a high-fidelity simulator. Results indicate that our experience filter offers a fast, low-effort, and near-optimal solution for creating policies for tasks or environments never seen before. Furthermore, the generated policies outperform a policy learned from all the data collected in past environments, suggesting that the correlation among different environments can be exploited and irrelevant environments can be filtered out.
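The filtering idea in the abstract (correlate environments with a kernel, then combine the parameterized policies of past environments) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the form of the experience kernel, the environment descriptors, or the policy parameterization. The sketch assumes a radial basis function kernel over hypothetical numeric environment descriptors and a normalized, kernel-weighted average of policy parameter vectors. The names `experience_kernel`, `filter_policy`, and `length_scale` are illustrative only.

```python
import numpy as np


def experience_kernel(env_a, env_b, length_scale=1.0):
    """Assumed RBF kernel: similarity in [0, 1] between two environment descriptors."""
    diff = np.asarray(env_a, dtype=float) - np.asarray(env_b, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * length_scale ** 2)))


def filter_policy(new_env, past_envs, past_params, length_scale=1.0):
    """Interpolate past policy parameters, weighted by similarity to new_env.

    Returns the interpolated parameter vector and the normalized weights.
    """
    weights = np.array(
        [experience_kernel(new_env, env, length_scale) for env in past_envs]
    )
    total = weights.sum()
    if total == 0.0:
        # Every past environment is numerically uncorrelated: fall back to uniform.
        weights = np.full(len(past_envs), 1.0 / len(past_envs))
    else:
        weights = weights / total
    params = weights @ np.asarray(past_params, dtype=float)
    return params, weights


if __name__ == "__main__":
    # Hypothetical one-dimensional descriptors for two previously visited environments.
    past_envs = [[0.0], [10.0]]
    # Hypothetical policy parameter vectors learned in those environments.
    past_params = [[1.0, 2.0], [3.0, 4.0]]
    params, weights = filter_policy([2.0], past_envs, past_params, length_scale=3.0)
    print("weights:", weights)
    print("interpolated parameters:", params)
```

Under these assumptions, an unseen environment close to a past one inherits nearly all of that environment's parameters, while distant environments receive negligible weight, which mirrors the abstract's point that irrelevant experiences can be filtered out.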
