Human-AI Shared Control via Frequency-based Policy Dissection


September 25, 2022


Quanyi Li, Zhenghao Peng, Haibin Wu, Lan Feng, Bolei Zhou

Human-AI shared control allows humans to interact and collaborate with AI to accomplish control tasks in complex environments. Previous reinforcement learning (RL) methods adopt goal-conditioned designs to achieve human-controllable policies, at the cost of redesigning the reward function and training paradigm. Inspired by the neuroscience approach to investigating the motor cortex in primates, we develop a simple yet effective frequency-based approach called Policy Dissection to align the intermediate representation of the learned neural controller with the kinematic attributes of the agent's behavior. Without modifying the neural controller or retraining the model, the proposed approach can convert a given RL-trained policy into a human-interactive policy. We evaluate the proposed approach on the RL tasks of autonomous driving and locomotion. The experiments show that human-AI shared control achieved by Policy Dissection in the driving task can substantially improve performance and safety in unseen traffic scenes. With a human in the loop, the locomotion robots also exhibit versatile, controllable motion skills even though they are only trained to move forward. Our results suggest a promising direction for implementing human-AI shared autonomy through interpreting the learned representation of autonomous agents. Demo video and code will be made available at https://metadriverse.github.io/policydissect.
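The abstract does not spell out how the frequency-based alignment is computed, so the following is only a minimal sketch of one way such a matching step could look, not the authors' implementation. It assumes that activations of hidden units and one kinematic attribute (for example, yaw rate) have been recorded over the same rollout, and it pairs the attribute with the unit whose magnitude spectrum is most similar. All names (`magnitude_spectrum`, `best_matching_unit`, `unit_0` and so on) and the synthetic signals are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Return DFT magnitudes of a real signal for bins 1..n//2 (mean removed)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum.append(abs(coeff))
    return spectrum


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def best_matching_unit(activations, attribute):
    """Pick the unit whose magnitude spectrum best matches the attribute's.

    activations: dict mapping a (hypothetical) unit name to its activation trace.
    attribute:   kinematic attribute recorded over the same time steps.
    """
    target = magnitude_spectrum(attribute)
    scores = {
        name: cosine_similarity(magnitude_spectrum(trace), target)
        for name, trace in activations.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Synthetic rollout (assumption): 64 time steps, attribute oscillates 4 times.
T = 64
attribute = [math.sin(2 * math.pi * 4 * t / T) for t in range(T)]
activations = {
    "unit_0": [math.sin(2 * math.pi * 9 * t / T) for t in range(T)],
    # Same frequency as the attribute, different phase, scale, and offset.
    "unit_1": [0.5 * math.cos(2 * math.pi * 4 * t / T) + 0.1 for t in range(T)],
    "unit_2": [math.sin(2 * math.pi * 2 * t / T) for t in range(T)],
}

best_unit, scores = best_matching_unit(activations, attribute)
print(best_unit, {name: round(score, 3) for name, score in scores.items()})
```

Comparing magnitude spectra rather than raw traces makes the match insensitive to phase shift, scale, and constant offset, which is why the sketch selects the unit that shares the attribute's oscillation frequency even though its trace looks different in the time domain. How a matched unit is then driven by a human command is not described in the abstract and is left out here.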

