Powered by deep representation learning, reinforcement learning (RL) provides an end-to-end learning framework capable of solving self-driving (SD) tasks without manual design. However, time-varying nonstationary environments cause proficient but specialized RL policies to fail at execution time. For example, an RL-based SD policy trained in sunny weather does not generalize well to rainy weather. Even though meta learning enables the RL agent to adapt to new tasks/environments, its offline operation fails to equip the agent with online adaptation ability when facing nonstationary environments. This work proposes an online meta reinforcement learning algorithm based on \emph{conjectural online lookahead adaptation} (COLA). COLA determines the online adaptation at every step by maximizing the agent's conjecture of its future performance over a lookahead horizon. Experimental results demonstrate that under dynamically changing weather and lighting conditions, COLA-based self-adaptive driving outperforms the baseline policies in terms of online adaptability. A demo video, source code, and appendices are available at {\tt https://github.com/Panshark/COLA}
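To make the adaptation rule concrete, the following is a minimal sketch of one COLA-style update, not the authors' implementation: at each step, the meta-learned policy parameters are adjusted by gradient ascent on a conjectured (model-based) estimate of the return over a short lookahead horizon. The helper {\tt conjectured\_return} and all hyperparameters are hypothetical placeholders.
\begin{verbatim}
import torch

def cola_adaptation_step(meta_params, conjectured_return,
                         horizon=5, lr=0.1, n_inner_steps=1):
    """One conjectural online lookahead adaptation step (sketch).

    conjectured_return(params, horizon): hypothetical differentiable
    function returning the agent's conjecture of its return over the
    next `horizon` steps under policy parameters `params`.
    """
    params = meta_params.clone().requires_grad_(True)
    for _ in range(n_inner_steps):
        # Conjectured lookahead performance under current adapted parameters.
        value = conjectured_return(params, horizon)
        # Gradient ascent: maximize the conjectured future performance.
        (grad,) = torch.autograd.grad(value, params)
        params = (params + lr * grad).detach().requires_grad_(True)
    return params
\end{verbatim}
The adapted parameters are used to act for the next step (or next few steps), after which the conjecture is refreshed with new observations and the update is repeated online.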