Online estimation and control with optimal pathlength regret - Zhuanzhi (专知) Papers

December 7, 2021


Gautam Goel, Babak Hassibi

A natural goal when designing online learning algorithms for non-stationary environments is to bound the regret of the algorithm in terms of the temporal variation of the input sequence. Intuitively, when the variation is small, it should be easier for the algorithm to achieve low regret, since past observations are predictive of future inputs. Such data-dependent "pathlength" regret bounds have recently been obtained for a wide variety of online learning problems, including OCO and bandits. We obtain the first pathlength regret bounds for online control and estimation (e.g. Kalman filtering) in linear dynamical systems. The key idea in our derivation is to reduce pathlength-optimal filtering and control to certain variational problems in robust estimation and control; these reductions may be of independent interest. Numerical simulations confirm that our pathlength-optimal algorithms outperform traditional $H_2$ and $H_{\infty}$ algorithms when the environment varies over time.
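To make "temporal variation of the input sequence" concrete, here is a minimal sketch. It assumes the pathlength is measured as the sum of squared Euclidean distances between consecutive inputs; the abstract does not state the exact definition, and the function name `pathlength` and the example sequences are hypothetical, chosen only for illustration.

```python
def pathlength(seq):
    """Sum of squared Euclidean distances between consecutive inputs.

    Assumed definition for illustration only; `seq` is a list of
    equal-length tuples, one per time step.
    """
    total = 0.0
    for prev, cur in zip(seq, seq[1:]):
        total += sum((c - p) ** 2 for p, c in zip(prev, cur))
    return total


# A constant input never changes, so its pathlength is zero.
constant = [(1.0, 0.0)] * 5

# A slowly drifting input changes a little at every step.
drifting = [(0.1 * t, 0.0) for t in range(5)]

# An input that flips sign every step changes a lot at every step.
alternating = [((-1.0) ** t, 0.0) for t in range(5)]

for name, seq in [("constant", constant),
                  ("drifting", drifting),
                  ("alternating", alternating)]:
    print(name, pathlength(seq))
```

Under this assumed measure, the constant and drifting sequences have small pathlength while the alternating one has large pathlength, which matches the intuition in the abstract that slowly varying inputs should be easier to handle.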

