Scale Invariant Solutions for Overdetermined Linear Systems with Applications to Reinforcement Learning - Zhuanzhi Papers


Scale invariance · Value function · Linear · Functional · Normalized

April 15, 2021

Scale Invariant Solutions for Overdetermined Linear Systems with Applications to Reinforcement Learning


Rahul Madhavan, Gugan Thoppe, Hemanta Makwana

From arXiv, 53 pages, 12 figures (9 pages main body with 5 figures)

Overdetermined linear systems are common in reinforcement learning, e.g., in Q and value function estimation with function approximation. The standard least-squares criterion, however, leads to a solution that is unduly influenced by rows with large norms. This is a serious issue, especially when the matrices in these systems are beyond user control. To address this, we propose a scale-invariant criterion that we then use to develop two novel algorithms for value function estimation: Normalized Monte Carlo and Normalized TD(0). Separately, we also introduce a novel adaptive stepsize that may be useful beyond this work as well. We use simulations and theoretical guarantees to demonstrate the efficacy of our ideas.
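The row-norm sensitivity described above can be illustrated with a small numerical sketch. It assumes one simple way to obtain scale invariance, namely dividing each row of the system by its Euclidean norm before solving; this is an illustrative stand-in, not necessarily the criterion proposed in the paper. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def least_squares(A, b):
    """Ordinary least-squares solution of the overdetermined system A x ~ b."""
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x


def row_normalized_least_squares(A, b):
    """Least squares after scaling every row of (A, b) to unit row norm.

    Assumption: row normalization is used here only to illustrate scale
    invariance; it is not claimed to be the paper's exact criterion.
    """
    norms = np.linalg.norm(A, axis=1)
    return least_squares(A / norms[:, None], b / norms)


# Synthetic inconsistent system with 6 equations and 2 unknowns.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 2))
b = rng.normal(size=6)

# Rescale the first equation by 100; the set of exact solutions of each
# individual equation is unchanged, only the row norm grows.
scale = np.ones(6)
scale[0] = 100.0
A_scaled = A * scale[:, None]
b_scaled = b * scale

# How far each solution moves when one row is rescaled.
ls_shift = np.linalg.norm(least_squares(A, b) - least_squares(A_scaled, b_scaled))
norm_shift = np.linalg.norm(
    row_normalized_least_squares(A, b)
    - row_normalized_least_squares(A_scaled, b_scaled)
)

print("shift of ordinary least-squares solution:", ls_shift)
print("shift of row-normalized solution:", norm_shift)
```

In this sketch the ordinary least-squares solution moves when one equation is rescaled, because the rescaled row dominates the squared-error sum, while the row-normalized solution stays the same up to floating-point error.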


