Report by the Deep Reinforcement Learning Lab
Source: AAAI-2020
Author: DeepRL
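AAAI 2020 received more than 8,800 valid paper submissions, of which 7,737 entered review and 1,591 were accepted, an acceptance rate of 20.6%. At least 52 of the accepted papers are on reinforcement learning, about 3% of the accepted list. By affiliation, Google Brain, DeepMind, Tsinghua University, UCL, Tencent AI Lab, Peking University, IBM, and Facebook are all heavily represented. By author, the list includes a paper from Richard Sutton, the elder statesman of reinforcement learning (entry 48), alongside work from rising researchers. The papers cover environments, theory and algorithms, applications, and multi-agent systems. The detailed list follows: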
Karol Kurach (Google Brain)*; Anton Raichuk (Google); Piotr Stańczyk (Google Brain); Michał Zając (Google Brain); Olivier Bachem (Google Brain); Lasse Espeholt (DeepMind); Carlos Riquelme (Google Brain); Damien Vincent (Google Brain); Marcin Michalski (Google); Olivier Bousquet (Google); Sylvain Gelly (Google Brain)
Xiaojian Ma (University of California, Los Angeles)*; Mingxuan Jing (Tsinghua University); Wenbing Huang (Tsinghua University); Chao Yang (Tsinghua University); Fuchun Sun (Tsinghua); Huaping Liu (Tsinghua University); Bin Fang (Tsinghua University)
Cristian Bodnar (University of Cambridge)*; Ben Day (University of Cambridge); Pietro Lió (University of Cambridge)
Jie Wu (Sun Yat-sen University)*; Guanbin Li (Sun Yat-sen University); Si Liu (Beihang University); Liang Lin (DarkMatter AI)
Nan Jiang (Tsinghua University)*; Sheng Jin (Tsinghua University); Zhiyao Duan (University of Rochester); Changshui Zhang (Tsinghua University)
Deheng Ye (Tencent)*; Zhao Liu (Tencent); Mingfei Sun (Tencent); Bei Shi (Tencent AI Lab); Peilin Zhao (Tencent AI Lab); Hao Wu (Tencent); Hongsheng Yu (Tencent); Shaojie Yang (Tencent); Xipeng Wu (Tencent); Qingwei Guo (Tsinghua University); Qiaobo Chen (Tencent); Yinyuting Yin (Tencent); Hao Zhang (Tencent); Tengfei Shi (Tencent); Liang Wang (Tencent); Qiang Fu (Tencent AI Lab); Wei Yang (Tencent AI Lab); Lanxiao Huang (Tencent)
Nicolas Anastassacos (The Alan Turing Institute)*; Steve Hailes (University College London); Mirco Musolesi (UCL)
Felipe Leno da Silva (University of Sao Paulo)*; Pablo Hernandez-Leal (Borealis AI); Bilal Kartal (Borealis AI); Matthew Taylor (Borealis AI)
Xinshi Zang (Shanghai Jiao Tong University)*; Huaxiu Yao (Pennsylvania State University); Guanjie Zheng (Pennsylvania State University); Nan Xu (University of Southern California); Kai Xu (Shanghai Tianrang Intelligent Technology Co., Ltd); Zhenhui (Jessie) Li (Penn State University)
Yang Liu (University of Science and Technology of China)*; Qi Liu (University of Science and Technology of China, China); Hongke Zhao (Tianjin University); Zhen Pan (University of Science and Technology of China); Chuanren Liu (The University of Tennessee Knoxville)
Hangyu Mao (Peking University)*; Wulong Liu (Huawei Noah's Ark Lab); Jianye Hao (Tianjin University); Jun Luo (Huawei Technologies Canada Co. Ltd.); Dong Li (Huawei Noah's Ark Lab); Zhengchao Zhang (Peking University); Jun Wang (UCL); Zhen Xiao (Peking University)
Chao Wen (Nanjing University of Aeronautics and Astronautics)*; Xinghu Yao (Nanjing University of Aeronautics and Astronautics); Yuhui Wang (Nanjing University of Aeronautics and Astronautics, China); Xiaoyang Tan (Nanjing University of Aeronautics and Astronautics, China)
Satoshi Kosugi (The University of Tokyo)*; Toshihiko Yamasaki (The University of Tokyo)
Jun Wang (University of Science and Technology of China)*; Hefu Zhang (University of Science and Technology of China); Qi Liu (University of Science and Technology of China, China); Zhen Pan (University of Science and Technology of China); Hanqing Tao (University of Science and Technology of China (USTC))
Wenjie Huang (Shenzhen Research Institute of Big Data)*; Hai Pham Viet (Department of Computer Science, School of Computing, National University of Singapore); William Benjamin Haskell (Supply Chain and Operations Management Area, Krannert School of Management, Purdue University)
Liang Tong (Washington University in Saint Louis)*; Aron Laszka (University of Houston); Chao Yan (Vanderbilt University); Ning Zhang (Washington University in St. Louis); Yevgeniy Vorobeychik (Washington University in St. Louis)
Chacha Chen (Pennsylvania State University)*; Hua Wei (Pennsylvania State University); Nan Xu (University of Southern California); Guanjie Zheng (Pennsylvania State University); Ming Yang (Shanghai Tianrang Intelligent Technology Co., Ltd); Yuanhao Xiong (Zhejiang University); Kai Xu (Shanghai Tianrang Intelligent Technology Co., Ltd); Zhenhui (Jessie) Li (Penn State University)
Erik Gärtner (Lund University)*; Aleksis Pirinen (Lund University); Cristian Sminchisescu (Lund University)
Min Yang (Chinese Academy of Sciences)*; Chengming Li (Chinese Academy of Sciences); Fei Sun (Alibaba Group); Zhou Zhao (Zhejiang University); Ying Shen (Peking University Shenzhen Graduate School); Chenglin Wu (fuzhi.ai)
Gal Dalal (Technion)*; Balazs Szorenyi (Yahoo Research); Gugan Thoppe (Duke University)
Jingkang Wang (University of Toronto); Yang Liu (UCSC); Bo Li (University of Illinois at Urbana–Champaign)*
Thomas Barrett (University of Oxford)*; William Clements (Unchartech); Jakob Foerster (Facebook AI Research); Alexander Lvovsky (Oxford University)
Vishal Jain (Mila, McGill University)*; Liam Fedus (Google); Hugo Larochelle (Google); Doina Precup (McGill University); Marc G. Bellemare (Google Brain)
Xian Yeow Lee (Iowa State University)*; Sambit Ghadai (Iowa State University); Kai Liang Tan (Iowa State University); Chinmay Hegde (New York University); Soumik Sarkar (Iowa State University)
Mahtab Ahmed (The University of Western Ontario)*; Robert Mercer (The University of Western Ontario)
Arthur Williams (Middle Tennessee State University)*; Joshua Phillips (Middle Tennessee State University)
Yunan Ye (Zhejiang University)*; Hengzhi Pei (Fudan University); Boxin Wang (University of Illinois at Urbana–Champaign); Pin-Yu Chen (IBM Research); Yada Zhu (IBM Research); Jun Xiao (Zhejiang University); Bo Li (University of Illinois at Urbana–Champaign)
Adrian Goldwaser (University of New South Wales)*; Michael Thielscher (University of New South Wales)
Jianwen Sun (Nanyang Technological University)*; Tianwei Zhang (Nanyang Technological University); Xiaofei Xie (Nanyang Technological University); Lei Ma (Kyushu University); Yan Zheng (Tianjin University); Kangjie Chen (Tianjin University); Yang Liu (Nanyang Technological University, Singapore)
Leonard Adolphs (ETHZ)*; Thomas Hofmann (ETH Zurich)
Daniel Furelos-Blanco (Imperial College London)*; Mark Law (Imperial College London); Alessandra Russo (Imperial College London); Krysia Broda (Imperial College London); Anders Jonsson (UPF)
Wentian Li (Tsinghua University)*; Xidong Feng (Department of Automation, Tsinghua University); Haotian An (Tsinghua University); Xiang Yao Ng (Tsinghua University); Yu-Jin Zhang (Tsinghua University)
Prashan Madumal (University of Melbourne)*; Tim Miller (University of Melbourne); Liz Sonenberg (University of Melbourne); Frank Vetere (University of Melbourne)
Guojia Wan (Wuhan University); Bo Du (School of Computer Science, Wuhan University)*; Shirui Pan (Monash University); Reza Haffari (Monash University, Australia)
Yash Chandak (University of Massachusetts Amherst)*; Georgios Theocharous (Adobe Research, USA); Blossom Metevier (University of Massachusetts, Amherst); Philip Thomas (University of Massachusetts Amherst)
Weiran Shen (Carnegie Mellon University)*; Binghui Peng (Columbia University); Hanpeng Liu (Tsinghua University); Michael Zhang (Chinese University of Hong Kong); Ruohan Qian (Baidu Inc.); Yan Hong (Baidu Inc.); Zhi Guo (Baidu Inc.); Zongyao Ding (Baidu Inc.); Pengjun Lu (Baidu Inc.); Pingzhong Tang (Tsinghua University)
Rich Contextual Representations Aditya Modi (Univ. of Michigan Ann Arbor)*; Debadeepta Dey (Microsoft); Alekh Agarwal (Microsoft); Adith Swaminathan (Microsoft Research); Besmira Nushi (Microsoft Research); Sean Andrist (Microsoft Research); Eric Horvitz (MSR)
Ya Xiao (Tongji University)*; Chengxiang Tan (Tongji University); Zhijie Fan (The Third Research Institute of the Ministry of Public Security); Qian Xu (Tongji University); Wenye Zhu (Tongji University)
Tomas Brazdil (Masaryk University); Krishnendu Chatterjee (IST Austria); Petr Novotný (Masaryk University)*; Jiří Vahala (Masaryk University)
Qi Zhou (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China); Jie Wang (University of Science and Technology of China)*
Maor Gaon (Ben-Gurion University); Ronen Brafman (BGU)*
Julian Whitman (Carnegie Mellon University)*; Raunaq Bhirangi (Carnegie Mellon University); Matthew Travers (CMU); Howie Choset (Carnegie Mellon University)
Morgane Ayle (American University of Beirut - AUB)*; Jimmy Tekli (BMW Group / Université de Franche-Comté - UFC); Julia Zini (American University of Beirut - AUB); Boulos El Asmar (BMW Group / Karlsruher Institut für Technologie - KIT); Mariette Awad (American University of Beirut - AUB)
Abdelrhman Saleh (Harvard University)*; Natasha Jaques (MIT); Asma Ghandeharioun (MIT); Judy Hanwen Shen (MIT); Rosalind Picard (MIT Media Lab)
Liqiang Xiao (Artificial Intelligence Institute, SJTU)*; Lu Wang (Khoury College of Computer Science, Northeastern University); Hao He (Shanghai Jiao Tong University); Yaohui Jin (Artificial Intelligence Institute, SJTU)
Xiang Ni (IBM Research); Jing Li (NJIT); Wang Zhou (IBM Research); Mo Yu (IBM T. J. Watson)*; Kun-Lung Wu (IBM Research)
Yu Wang (Microsoft)*; Jack Stokes (Microsoft Research); Mady Marinescu (Microsoft Corporation)
Kristopher De Asis (University of Alberta)*; Alan Chan (University of Alberta); Silviu Pitis (University of Toronto); Richard Sutton (University of Alberta); Daniel Graves (Huawei)
Liqun Chen (Duke University)*; Ke Bai (Duke University); Chenyang Tao (Duke University); Yizhe Zhang (Microsoft Research); Guoyin Wang (Duke University); Wenlin Wang (Duke University); Ricardo Henao (Duke University); Lawrence Carin (Duke CS)
Fabio Pardo (Imperial College London)*; Vitaly Levdik (Imperial College London); Petar Kormushev (Imperial College London)
Tian Tan (Stanford University)*; Zhihan Xiong (Stanford University); Vikranth Dwaracherla (Stanford University)
Sanket Shah (Singapore Management University)*; Arunesh Sinha (Singapore Management University); Pradeep Varakantham (Singapore Management University); Andrew Perrault (Harvard University); Milind Tambe (Harvard University)