In large-scale time series forecasting, one often encounters the situation where the temporal patterns of time series, while drifting over time, differ from one another within the same dataset. In this paper, we provably show that under such heterogeneity, training a forecasting model with commonly used stochastic optimizers (e.g., SGD) potentially suffers from large variance in gradient estimation, and thus incurs long training times. We show that this issue can be efficiently alleviated via stratification, which allows the optimizer to sample from pre-grouped time series strata. To better trade off gradient variance against computational complexity, we further propose SCott (Stochastic Stratified Control Variate Gradient Descent), a variance-reduced SGD-style optimizer that utilizes stratified sampling via control variates. In theory, we provide a convergence guarantee for SCott on smooth non-convex objectives. Empirically, we evaluate SCott and other baseline optimizers on both synthetic and real-world time series forecasting problems, and demonstrate that SCott converges faster with respect to both the number of iterations and wall clock time.
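To make the idea of stratified sampling with control variates concrete, here is a minimal sketch (not the authors' implementation) of an SVRG-style, SCott-like gradient estimator on a toy least-squares problem. The strata, helper names (scott_like_step, stratum_mean_grad), step size, and anchor-refresh schedule are all illustrative assumptions; only the general recipe of per-stratum control variates combined with stratified weighting follows the abstract.

```python
import numpy as np

# Minimal sketch (not the paper's code): stratified control-variate gradient
# estimation for least squares, f(w) = (1/N) * sum_i (x_i . w - y_i)^2.
# Assumed setup: strata are pre-grouped index sets; per-stratum "anchor"
# gradients are refreshed periodically, as in SVRG-style variance reduction.

rng = np.random.default_rng(0)


def grad_i(w, X, y, i):
    """Per-example gradient of the squared error at index i."""
    return 2.0 * (X[i] @ w - y[i]) * X[i]


def stratum_mean_grad(w, X, y, idx):
    """Full gradient over one stratum (used as the control-variate mean)."""
    r = X[idx] @ w - y[idx]
    return 2.0 * (X[idx] * r[:, None]).mean(axis=0)


def scott_like_step(w, w_anchor, anchor_means, X, y, strata, lr):
    """One SGD-style step with stratified sampling plus control variates.

    For each stratum S_k, sample one index i_k and form
        g_k = grad_i(w, i_k) - grad_i(w_anchor, i_k) + anchor_means[k],
    then average the g_k weighted by stratum sizes |S_k| / N.
    """
    N = sum(len(s) for s in strata)
    g = np.zeros_like(w)
    for k, idx in enumerate(strata):
        i = rng.choice(idx)
        g_k = grad_i(w, X, y, i) - grad_i(w_anchor, X, y, i) + anchor_means[k]
        g += (len(idx) / N) * g_k
    return w - lr * g


# Toy data: two strata with different "temporal patterns" (different slopes).
X = rng.normal(size=(200, 5))
w_true_a, w_true_b = rng.normal(size=5), rng.normal(size=5)
y = np.concatenate([X[:100] @ w_true_a, X[100:] @ w_true_b])
strata = [np.arange(100), np.arange(100, 200)]

w = np.zeros(5)
for epoch in range(50):
    # Refresh the anchor point and per-stratum anchor gradients periodically.
    w_anchor = w.copy()
    anchor_means = [stratum_mean_grad(w_anchor, X, y, idx) for idx in strata]
    for _ in range(20):
        w = scott_like_step(w, w_anchor, anchor_means, X, y, strata, lr=0.05)

print(f"final training loss: {np.mean((X @ w - y) ** 2):.4f}")
```

Each per-stratum estimate g_k is unbiased for that stratum's gradient at w, while the control-variate term cancels much of the sampling noise near the anchor point; combining the strata with weights |S_k| / N recovers an unbiased estimate of the full-dataset gradient with lower variance than plain uniform SGD sampling under heterogeneity.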