Top Conference Papers || A Roundup of 65 IJCAI Deep Reinforcement Learning Papers

March 18, 2020 · 专知 (Zhuanzhi)

Reported by the Deep Reinforcement Learning Lab

Source: IJCAI

Editor: DeepRL

  1. A Dual Reinforcement Learning Framework for Unsupervised Text Style Transfer: Fuli Luo, Peng Li, Jie Zhou, Pengcheng Yang, Baobao Chang, Xu Sun, Zhifang Sui
  2. A Restart-based Rank-1 Evolution Strategy for Reinforcement Learning: Zefeng Chen, Yuren Zhou, Xiao-yu He, Siyu Jiang
  3. An Actor-Critic-Attention Mechanism for Deep Reinforcement Learning in Multi-view Environments: Elaheh Barati, Xuewen Chen
  4. An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents: Felipe Petroski Such, Vashisht Madhavan, Rosanne Liu, Rui Wang, Pablo Samuel Castro, Yulun Li, Jiale Zhi, Ludwig Schubert, Marc G. Bellemare, Jeff Clune, Joel Lehman
  5. Automatic Successive Reinforcement Learning with Multiple Auxiliary Rewards: Zhao-Yang Fu, De-Chuan Zhan, Xin-Chun Li, Yi-Xing Lu
  6. Autoregressive Policies for Continuous Control Deep Reinforcement Learning: Dmytro Korenkevych, Ashique Rupam Mahmood, Gautham Vasan, James Bergstra
  7. Deep Multi-Agent Reinforcement Learning with Discrete-Continuous Hybrid Action Spaces: Haotian Fu, Hongyao Tang, Jianye Hao, Zihan Lei, Yingfeng Chen, Changjie Fan
  8. Dynamic Electronic Toll Collection via Multi-Agent Deep Reinforcement Learning with Edge-Based Graph Convolutional Network Representation: Wei Qiu, Haipeng Chen, Bo An
  9. Energy-Efficient Slithering Gait Exploration for a Snake-Like Robot Based on Reinforcement Learning: Zhenshan Bing, Christian Lemke, Zhuangyi Jiang, Kai Huang, Alois Knoll
  10. Explaining Reinforcement Learning to Mere Mortals: An Empirical Study: Andrew Anderson, Jonathan Dodge, Amrita Sadarangani, Zoe Juozapaitis, Evan Newman, Jed Irvine, Souti Chattopadhyay, Alan Fern, Margaret Burnett
  11. Hybrid Actor-Critic Reinforcement Learning in Parameterized Action Space: Zhou Fan, Rui Su, Weinan Zhang, Yong Yu
  12. Incremental Learning of Planning Actions in Model-Based Reinforcement Learning: Alvin Ng, Ron Petrick
  13. Interactive Reinforcement Learning with Dynamic Reuse of Prior Knowledge from Human/Agent's Demonstration: Zhaodong Wang, Matt Taylor
  14. Interactive Teaching Algorithms for Inverse Reinforcement Learning: Parameswaran Kamalaruban, Rati Devidze, Volkan Cevher, Adish Singla
  15. Large-Scale Home Energy Management Using Entropy-Based Collective Multiagent Deep Reinforcement Learning: Yaodong Yang, Jianye Hao, Yan Zheng, Chao Yu
  16. Meta Reinforcement Learning with Task Embedding and Shared Policy: Lin Lan, Zhenguo Li, Xiaohong Guan, Pinghui Wang
  17. Metatrace Actor-Critic: Online Step-Size Tuning by Meta-gradient Descent for Reinforcement Learning Control: Kenny Young, Baoxiang Wang, Matthew E. Taylor
  18. Playing Card-Based RTS Games with Deep Reinforcement Learning: Tianyu Liu, Zijie Zheng, Hongchang Li, Kaigui Bian, Lingyang Song
  19. Playing FPS Games With Environment-Aware Hierarchical Reinforcement Learning: Shihong Song, Jiayi Weng, Hang Su, Dong Yan, Haosheng Zou, Jun Zhu
  20. Reinforcement Learning Experience Reuse with Policy Residual Representation: WenJi Zhou, Yang Yu, Yingfeng Chen, Kai Guan, Tangjie Lv, Changjie Fan, Zhi-Hua Zhou
  21. Reward Learning for Efficient Reinforcement Learning in Extractive Document Summarisation: Yang Gao, Christian Meyer, Mohsen Mesgar, Iryna Gurevych
  22. Sharing Experience in Multitask Reinforcement Learning: Tung-Long Vuong, Do-Van Nguyen, Tai-Long Nguyen, Cong-Minh Bui, Hai-Dang Kieu, Viet-Cuong Ta, Quoc-Long Tran, Thanh-Ha Le
  23. SlateQ: A Tractable Decomposition for Reinforcement Learning with Recommendation Sets: Eugene Ie, Vihan Jain, Jing Wang, Sanmit Narvekar, Ritesh Agarwal, Rui Wu, Heng-Tze Cheng, Tushar Chandra, Craig Boutilier
  24. Soft Policy Gradient Method for Maximum Entropy Deep Reinforcement Learning: Wenjie Shi, Shiji Song, Cheng Wu
  25. Solving Continual Combinatorial Selection via Deep Reinforcement Learning: HyungSeok Song, Hyeryung Jang, Hai H. Tran, Se-eun Yoon, Kyunghwan Son, Donggyu Yun, Hyoju Chung, Yung Yi
  26. Successor Options: An Option Discovery Framework for Reinforcement Learning: Rahul Ramesh, Manan Tomar, Balaraman Ravindran
  27. Transfer of Temporal Logic Formulas in Reinforcement Learning: Zhe Xu, Ufuk Topcu
  28. Using Natural Language for Reward Shaping in Reinforcement Learning: Prasoon Goyal, Scott Niekum, Raymond Mooney
  29. Value Function Transfer for Deep Multi-Agent Reinforcement Learning Based on N-Step Returns: Yong Liu, Yujing Hu, Yang Gao, Yingfeng Chen, Changjie Fan
  30. Failure-Scenario Maker for Rule-Based Agent using Multi-agent Adversarial Reinforcement Learning and its Application to Autonomous Driving: Akifumi Wachi
  31. LTL and Beyond: Formal Languages for Reward Function Specification in Reinforcement Learning: Alberto Camacho, Rodrigo Toro Icarte, Toryn Q. Klassen, Richard Valenzano, Sheila McIlraith
  32. A Survey of Reinforcement Learning Informed by Natural Language: Jelena Luketina, Nantas Nardelli, Gregory Farquhar, Jakob Foerster, Jacob Andreas, Edward Grefenstette, Shimon Whiteson, Tim Rocktäschel
  33. Leveraging Human Guidance for Deep Reinforcement Learning Tasks: Ruohan Zhang, Faraz Torabi, Lin Guan, Dana H. Ballard, Peter Stone
  34. CRSRL: Customer Routing System using Reinforcement Learning: Chong Long, Zining Liu, Xiaolu Lu, Zehong Hu, Yafang Wang
  35. Deep Reinforcement Learning for Ride-sharing Dispatching and Repositioning: Zhiwei (Tony) Qin, Xiaocheng Tang, Yan Jiao, Fan Zhang, Chenxi Wang
  36. Learning Deep Decentralized Policy Network by Collective Rewards for Real-Time Combat Game: Peixi Peng, Junliang Xing, Lili Cao, Lisen Mu, Chang Huang
  37. Monte Carlo Tree Search for Policy Optimization: Xiaobai Ma, Katherine Driggs-Campbell, Zongzhang Zhang, Mykel J. Kochenderfer
  38. On Principled Entropy Exploration in Policy Optimization: Jincheng Mei, Chenjun Xiao, Ruitong Huang, Dale Schuurmans, Martin Müller
  39. Recurrent Existence Determination Through Policy Optimization: Baoxiang Wang
  40. Diversity-Inducing Policy Gradient: Using Maximum Mean Discrepancy to Find a Set of Diverse Policies: Muhammad Masood, Finale Doshi-Velez
  41. A probabilistic logic for resource-bounded multi-agent systems: Hoang Nga Nguyen, Abdur Rakib
  42. A Value-based Trust Assessment Model for Multi-agent Systems: Kinzang Chhogyal, Abhaya Nayak, Aditya Ghose, Hoa Khanh Dam
  43. Branch-and-Cut-and-Price for Multi-Agent Pathfinding: Edward Lam, Pierre Le Bodic, Daniel Harabor, Peter J. Stuckey
  44. Decidability of Model Checking Multi-Agent Systems with Regular Expressions against Epistemic HS Specifications: Jakub Michaliszyn, Piotr Witkowski
  45. Improved Heuristics for Multi-Agent Path Finding with Conflict-Based Search: Jiaoyang Li, Eli Boyarski, Ariel Felner, Hang Ma, Sven Koenig
  46. Integrating Decision Sharing with Prediction in Decentralized Planning for Multi-Agent Coordination under Uncertainty: Minglong Li, Wenjing Yang, Zhongxuan Cai, Shaowu Yang, Ji Wang
  47. Multi-agent Attentional Activity Recognition: Kaixuan Chen, Lina Yao, Dalin Zhang, Bin Guo, Zhiwen Yu
  48. Multi-Agent Pathfinding with Continuous Time: Anton Andreychuk, Konstantin Yakovlev, Dor Atzmon, Roni Stern
  49. Priority Inheritance with Backtracking for Iterative Multi-agent Path Finding: Keisuke Okumura, Manao Machida, Xavier Défago, Yasumasa Tamura
  50. The Interplay of Emotions and Norms in Multiagent Systems: Anup K. Kalia, Nirav Ajmeri, Kevin S. Chan, Jin-Hee Cho, Sibel Adali, Munindar Singh
  51. Unifying Search-based and Compilation-based Approaches to Multi-agent Path Finding through Satisfiability Modulo Theories: Pavel Surynek
  52. Implicitly Coordinated Multi-Agent Path Finding under Destination Uncertainty: Success Guarantees and Computational Complexity (Extended Abstract): Bernhard Nebel, Thomas Bolander, Thorsten Engesser, Robert Mattmüller
  53. Embodied Conversational AI Agents in a Multi-modal Multi-agent Competitive Dialogue: Rahul Divekar, Xiangyang Mou, Lisha Chen, Maíra Gatti de Bayser, Melina Alberio Guerra, Hui Su
  54. Multi-Agent Path Finding on Ozobots: Roman Barták, Ivan Krasičenko, Jiří Švancara
  55. Multi-Agent Visualization for Explaining Federated Learning: Xiguang Wei, Quan Li, Yang Liu, Han Yu, Tianjian Chen, Qiang Yang
  56. Automated Machine Learning with Monte-Carlo Tree Search: Herilalaina Rakotoarison, Marc Schoenauer, Michele Sebag
  57. Influence of State-Variable Constraints on Partially Observable Monte Carlo Planning: Alberto Castellini, Georgios Chalkiadakis, Alessandro Farinelli
  58. Multiple Policy Value Monte Carlo Tree Search: Li-Cheng Lan, Wei Li, Ting-Han Wei, I-Chen Wu
  59. Subgoal-Based Temporal Abstraction in Monte-Carlo Tree Search: Thomas Gabor, Jan Peter, Thomy Phan, Christian Meyer, Claudia Linnhoff-Popien
  60. A Convergence Analysis of Distributed SGD with Communication-Efficient Gradient Sparsification: Shaohuai Shi, Kaiyong Zhao, Qiang Wang, Zhenheng Tang, Xiaowen Chu
  61. AsymDPOP: Complete Inference for Asymmetric Distributed Constraint Optimization Problems: Yanchen Deng, Ziyu Chen, Dingding Chen, Wenxin Zhang, Xingqiong Jiang
  62. Distributed Collaborative Feature Selection Based on Intermediate Representation: Xiucai Ye, Hongmin Li, Akira Imakura, Tetsuya Sakurai
  63. FABA: An Algorithm for Fast Aggregation against Byzantine Attacks in Distributed Neural Networks: Qi Xia, Zeyi Tao, Zijiang Hao, Qun Li
  64. Faster Distributed Deep Net Training: Computation and Communication Decoupled Stochastic Gradient Descent: Shuheng Shen, Linli Xu, Jingchang Liu, Xianfeng Liang, Yifei Cheng
  65. Fully Distributed Bayesian Optimization with Stochastic Policies: Javier Garcia-Barcos, Ruben Martinez-Cantin

GitHub link

https://github.com/NeuronDance/DeepRL/tree/master/DRL-ConferencePaper/IJCAI



Quick access via 专知

For convenient downloading, follow the 专知 (Zhuanzhi) WeChat public account.

  • Reply "DRLA" in the account's backend to get the 专知 download link for the new Manning (2020) book Deep Reinforcement Learning (深度强化学习实战, 351-page PDF).

Background: Reinforcement Learning

Reinforcement learning (RL) is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize cumulative reward. Alongside supervised and unsupervised learning, it is one of the three basic machine-learning paradigms. RL differs from supervised learning in that labeled input/output pairs need not be presented and suboptimal actions need not be explicitly corrected; instead, the focus is on finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge). The environment is typically stated as a Markov decision process (MDP), since many RL algorithms for this setting use dynamic-programming techniques. The main difference between classical dynamic-programming methods and RL algorithms is that the latter do not assume an exact mathematical model of the MDP and target large MDPs where exact methods become infeasible.
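To make two points of this definition concrete, the exploration-exploitation trade-off and the fact that RL methods do not assume an exact model of the MDP, here is a minimal sketch of tabular Q-learning with ε-greedy action selection on a toy corridor MDP. The environment, reward scheme, and hyperparameters are illustrative assumptions for this post, not taken from any of the papers listed above.

```python
import random

# Toy corridor MDP: states 0..4, start at state 0, +1 reward on reaching state 4.
# Actions: 0 = step left, 1 = step right. The goal state is terminal.
N_STATES, GOAL = 5, 4
ACTIONS = (0, 1)

def step(state, action):
    """One environment transition; returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = (next_state == GOAL)
    return next_state, (1.0 if done else 0.0), done

alpha, gamma, epsilon = 0.1, 0.9, 0.1       # step size, discount, exploration rate
Q = [[0.0, 0.0] for _ in range(N_STATES)]   # tabular action-value estimates

for _ in range(500):                        # episodes
    state, done = 0, False
    while not done:
        # Exploration vs. exploitation: act randomly with probability epsilon,
        # otherwise pick the greedy action under the current Q estimates.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[state][a])
        next_state, reward, done = step(state, action)
        # Model-free Q-learning update: no transition model is assumed;
        # Q(s, a) moves toward the bootstrapped target r + gamma * max_a' Q(s', a').
        target = reward + (0.0 if done else gamma * max(Q[next_state]))
        Q[state][action] += alpha * (target - Q[state][action])
        state = next_state

# The learned greedy policy should be "right" in every non-terminal state.
print(["right" if Q[s][1] > Q[s][0] else "left" for s in range(GOAL)])
```

Running the sketch for a few hundred episodes drives the greedy policy toward always stepping right; scaling this idea to large state spaces with neural-network function approximators is, roughly, what the "deep" in the deep reinforcement learning papers above refers to.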
