The growing volume of data makes computationally intensive machine learning techniques, such as symbolic regression with genetic programming, increasingly impractical. This work discusses methods to reduce the training data and thereby also the runtime of genetic programming. The data is aggregated in a preprocessing step before running the actual machine learning algorithm. K-means clustering and data binning are used for data aggregation and compared with random sampling as the simplest data reduction method. We analyze the achieved speed-up in training and the effects on the trained models' test accuracy for every method on four real-world data sets. The performance of genetic programming is compared with random forests and linear regression. It is shown that k-means clustering and random sampling lead to only a very small loss in test accuracy when the data is reduced down to 30% of the original data, while the speed-up is proportional to the size of the data set. Binning, on the contrary, leads to models with very high test error.
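To make the preprocessing step concrete, the following is a minimal sketch of how such data aggregation could be implemented before training. It assumes scikit-learn's KMeans and NumPy; the function names, the 30% reduction fraction, and the choice to pair each centroid with the mean target of its cluster are illustrative assumptions, not the exact procedure used in the experiments.

```python
import numpy as np
from sklearn.cluster import KMeans

def reduce_by_kmeans(X, y, fraction=0.3, random_state=0):
    """Aggregate (X, y) into k cluster prototypes, with k = fraction * n_samples.

    Each reduced sample is a cluster centroid paired with the mean target
    value of the points assigned to that cluster (an illustrative choice).
    """
    k = max(1, int(fraction * len(X)))
    km = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(X)
    X_reduced = km.cluster_centers_
    y_reduced = np.array([y[km.labels_ == c].mean() for c in range(k)])
    return X_reduced, y_reduced

def reduce_by_random_sampling(X, y, fraction=0.3, random_state=0):
    """Baseline: keep a uniformly random subset of the original rows."""
    rng = np.random.default_rng(random_state)
    idx = rng.choice(len(X), size=max(1, int(fraction * len(X))), replace=False)
    return X[idx], y[idx]
```

The reduced data set returned by either function would then be passed to the genetic programming run (or to random forests and linear regression for comparison) in place of the full training data.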