Test of Time Award: Lin Xiao: Dual Averaging Method for Regularized Stochastic Learning and Online Optimization. NIPS 2009: 2116-2124
Outstanding New Directions Paper Award: Uniform convergence may be unable to explain generalization in deep learning
Best Paper: Ilias Diakonikolas, Themis Gouleakis, Christos Tzamos: Distribution-Independent PAC Learning of Halfspaces with Massart Noise
Su Young Lee, Sung-Ik Choi, Sae-Young Chung: Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update
Xiuyuan Lu, Benjamin Van Roy: Information-Theoretic Confidence Bounds for Reinforcement Learning
Zihan Zhang, Xiangyang Ji: Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
Simon Ramstedt, Christopher J. Pal: Real-Time Reinforcement Learning
Ming Yu, Zhuoran Yang, Mladen Kolar, Zhaoran Wang: Convergent Policy Optimization for Safe Reinforcement Learning
Nathan Kallus, Masatoshi Uehara: Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Marc G. Bellemare, Will Dabney, Robert Dadashi, Adrien Ali Taïga, Pablo Samuel Castro, Nicolas Le Roux, Dale Schuurmans, Tor Lattimore, Clare Lyle: A Geometric Perspective on Optimal Representations for Reinforcement Learning
Harsh Gupta, R. Srikant, Lei Ying: Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning
Ben Deverett, Ryan Faulkner, Meire Fortunato, Greg Wayne, Joel Z. Leibo: Interval Timing in Deep Reinforcement Learning Agents
Erwan Lecarpentier, Emmanuel Rachelson: Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning
Marginalized Off-Policy Evaluation for Reinforcement Learning
Wenjie Shi, Shiji Song, Hui Wu, Ya-Chu Hsu, Cheng Wu, Gao Huang: Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning
Yonathan Efroni, Nadav Merlis, Mohammad Ghavamzadeh, Shie Mannor: Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies
Sobhan Miryoosefi, Kianté Brantley, Hal Daumé III, Miroslav Dudík: Reinforcement Learning with Convex Constraints
Bastian Alt, Adrian Sosic, Heinz Koeppl: Correlation Priors for Reinforcement Learning
Yuzhe Ma, Xuezhou Zhang, Wen Sun, Xiaojin Zhu: Policy Poisoning in Batch Reinforcement Learning and Control
Abhinav Verma, Hoang Minh Le, Yisong Yue, Swarat Chaudhuri: Imitation-Projected Policy Gradient for Programmatic Reinforcement Learning