Research on Sequential Decision Making for Ad Hoc Agents Based on Autonomous Learning - Zhuanzhi Funds


December 31, 2015


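National Natural Science Fund of China

National Natural Science Foundation of China (NSFC)

Project title: Research on Sequential Decision Making for Ad Hoc Agents Based on Autonomous Learning

Project number: No.61502322

Project type: Young Scientists Fund

Year of approval: 2016

Discipline: Other

Project author: Chen Yingke (陈盈科)

Affiliation: Sichuan University

Funding: 200,000 CNY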

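Abstract (translated from the Chinese): Research on multi-agent decision making often assumes that agents accomplish predefined tasks through communication and coordination. This assumption does not hold for multi-agent systems in which agents compete. Developing agents that can adapt in unknown decision environments, namely Ad hoc Agents, is therefore a highly challenging emerging problem in multi-agent research. This project will propose a new framework, based on autonomous learning and decision making by individual agents, for formulating and solving sequential decision problems that involve multiple Ad hoc Agents. The main research content is as follows: using machine learning methods to enable an Ad hoc Agent to construct autonomously, from interaction data, models that accurately characterize the behavior of other agents, and to update its own decision model; on that basis, extending the learning algorithms for the behavioral model of an individual agent to learning the abstract behavior of a population of agents; and finally building an Ad hoc Agent simulation platform set in an unmanned aerial vehicle simulation scenario. The project aims to construct a new type of Ad hoc Agent that can autonomously discover and respond appropriately to the behavior of unfamiliar agents, providing a theoretical basis and practical guidance for applying multi-agent technology to more complex and changeable real-world scenarios.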

Keywords (translated from the Chinese): decision-making approach; model learning; uncertainty

English abstract: Multi-agent decision-making techniques often assume cooperative agents that accomplish predefined tasks through communication and coordination. These techniques, however, are not applicable to decision problems involving competitive agents. It is challenging to develop an adaptive agent, namely an Ad hoc agent, that can construct and solve decision problems in an environment shared with other agents whose relationships to it are unknown. This project will solve sequential decision-making problems involving Ad hoc agents from the perspective of an individual agent. A subject agent will learn the behavior of other Ad hoc agents by adapting machine learning techniques and will update its own decision models accordingly. The project will extend learning algorithms for constructing the behavioral model of a single agent to learning the behavioral patterns of a population of other agents. Based on an unmanned aerial vehicle scenario, the project will build a platform for simulating interactions, performing learning, and conducting evaluation for Ad hoc agents. In summary, the project will develop a new type of Ad hoc agent that can actively explore an environment shared with other unknown agents. The research outcomes will facilitate applications of multi-agent technologies in complex problem domains and provide theoretical guarantees and practical guidelines.

English keywords: Decision Making; Model Learning; Uncertainty
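
The abstract describes a subject agent that learns a model of another agent's behavior from interaction data and then updates its own decision. The following is only a minimal illustrative sketch of that loop, not the project's method: it assumes a tabular frequency model with Laplace smoothing and a one-step best response. The class `OtherAgentModel`, the function `best_response`, the state name `"head_on"`, the actions, and the payoff table are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


class OtherAgentModel:
    """Hypothetical empirical model of another agent: P(action | state) from counts."""

    def __init__(self, actions, smoothing=1.0):
        self.actions = list(actions)
        # Laplace smoothing keeps nonzero probability on actions not yet observed.
        self.smoothing = smoothing
        self.counts = defaultdict(lambda: defaultdict(float))

    def observe(self, state, action):
        """Record one observed (state, action) pair from interaction data."""
        self.counts[state][action] += 1.0

    def predict(self, state):
        """Return the smoothed action distribution of the other agent in `state`."""
        row = self.counts.get(state, {})
        total = sum(row.get(a, 0.0) for a in self.actions)
        total += self.smoothing * len(self.actions)
        return {a: (row.get(a, 0.0) + self.smoothing) / total for a in self.actions}


def best_response(model, state, my_actions, payoff):
    """Pick the action with the highest expected payoff under the learned model.

    `payoff[(mine, other)]` is an assumed, known one-step payoff table.
    """
    belief = model.predict(state)

    def expected(mine):
        return sum(p * payoff[(mine, other)] for other, p in belief.items())

    return max(my_actions, key=expected)


# Illustrative scenario: two aircraft approaching head-on; swerving to the same
# side is penalized (-1), swerving to opposite sides is rewarded (+1).
ACTIONS = ["left", "right"]
PAYOFF = {
    ("left", "left"): -1.0,
    ("left", "right"): 1.0,
    ("right", "left"): 1.0,
    ("right", "right"): -1.0,
}

model = OtherAgentModel(ACTIONS)
for observed in ["left"] * 8 + ["right"] * 2:
    model.observe("head_on", observed)

belief = model.predict("head_on")
choice = best_response(model, "head_on", ACTIONS, PAYOFF)
print(belief, choice)
```

In this toy setting the subject agent never communicates with the other agent; it only counts what it sees and responds to the resulting belief. A sequential version would replace the one-step payoff with a multi-step planner over the learned model, which the abstract leaves unspecified.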
