深网络何时真的比小样本大小的随机森林更好? (When are Deep Networks really better than Random Forests at small sample sizes?) - 专知论文

会员服务 ·

0

Performer · Better · 随机森林 · Networking · 样本 ·

2021 年 8 月 31 日

When are Deep Networks really better than Random Forests at small sample sizes?

翻译：深网络何时真的比小样本大小的随机森林更好?

Haoyin Xu,Michael Ainsworth,Yu-Chung Peng,Madi Kusmanov,Sambit Panda,Joshua T. Vogelstein

Random forests (RF) and deep networks (DN) are two of the most popular machine learning methods in the current scientific literature and yield differing levels of performance on different data modalities. We wish to further explore and establish the conditions and domains in which each approach excels, particularly in the context of sample size and feature dimension. To address these issues, we tested the performance of these approaches across tabular, image, and audio settings using varying model parameters and architectures. Our focus is on datasets with at most 10,000 samples, which represent a large fraction of scientific and biomedical datasets. In general, we found RF to excel at tabular and structured data (image and audio) with small sample sizes, whereas DN performed better on structured data with larger sample sizes. Although we plan to continue updating this technical report in the coming months, we believe the current preliminary results may be of interest to others.

翻译：随机森林(RF)和深海网络(DN)是当前科学文献中最流行的两种机器学习方法,在不同数据模式方面产生不同程度的绩效,我们希望进一步探索和确定每种方法优异的条件和领域,特别是在抽样规模和特征方面。为了解决这些问题,我们利用不同的模型参数和结构,在表格、图像和音频设置中测试了这些方法的性能。我们的重点是最多有10 000个样本的数据集,这些样本占科学和生物医学数据集的一大部分。一般而言,我们发现RF以表格和结构化数据(图像和音频)为优势,样本规模较小,而DN在结构化数据方面表现更好,样本规模较大。虽然我们计划在未来几个月继续更新这份技术报告,但我们认为目前的初步结果可能令其他人感兴趣。

0

相关内容

Performer

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

商业数据分析，39页ppt

商业数据分析，39页ppt

专知会员服务

165+阅读 · 2020年6月2日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

【TED】什么让我们生病

【TED】什么让我们生病

英语演讲视频每日一推

7+阅读 · 2019年1月23日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【学习】(Python)SVM数据分类

【学习】(Python)SVM数据分类

机器学习研究会

6+阅读 · 2017年10月15日

【推荐】SVM实例教程

【推荐】SVM实例教程

机器学习研究会

17+阅读 · 2017年8月26日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Berry-Esseen Bounds for Projection Parameters and Partial Correlations with Increasing Dimension

Arxiv

0+阅读 · 2021年10月22日

Time Series Forecasting Using Manifold Learning

Arxiv

0+阅读 · 2021年10月21日

Time Series is a Special Sequence: Forecasting with Sample Convolution and Interaction

Arxiv

0+阅读 · 2021年10月19日

Z-GCNETs: Time Zigzags at Graph Convolutional Networks for Time Series Forecasting

Arxiv

10+阅读 · 2021年5月10日

End-to-End Object Detection with Fully Convolutional Network

Arxiv

9+阅读 · 2021年3月26日

REST: Relational Event-driven Stock Trend Forecasting

Arxiv

15+阅读 · 2021年2月19日

Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data

Arxiv

9+阅读 · 2021年2月8日

Scattering GCN: Overcoming Oversmoothness in Graph Convolutional Networks

Arxiv

3+阅读 · 2020年3月18日

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Arxiv

14+阅读 · 2019年8月8日

Bayesian Convolutional Neural Networks

Arxiv

19+阅读 · 2018年6月27日

VIP会员

文章信息

相关主题

相关VIP内容

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

商业数据分析，39页ppt

商业数据分析，39页ppt

专知会员服务

165+阅读 · 2020年6月2日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《分析与预测陆军战斗体能测试表现：统计与机器学习方法》2025最新137页

《军事行动中的人机协同共同学习》2025最新文献

代理式人工智能时代的决策优势

《F/A-18机队替换中队仿真模型的设计与分析》2025最新73页

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

【TED】什么让我们生病

【TED】什么让我们生病

英语演讲视频每日一推

7+阅读 · 2019年1月23日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【学习】(Python)SVM数据分类

【学习】(Python)SVM数据分类

机器学习研究会

6+阅读 · 2017年10月15日

【推荐】SVM实例教程

【推荐】SVM实例教程

机器学习研究会

17+阅读 · 2017年8月26日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Berry-Esseen Bounds for Projection Parameters and Partial Correlations with Increasing Dimension

Arxiv

0+阅读 · 2021年10月22日

Time Series Forecasting Using Manifold Learning

Arxiv

0+阅读 · 2021年10月21日

Time Series is a Special Sequence: Forecasting with Sample Convolution and Interaction

Arxiv

0+阅读 · 2021年10月19日

Z-GCNETs: Time Zigzags at Graph Convolutional Networks for Time Series Forecasting

Arxiv

10+阅读 · 2021年5月10日

End-to-End Object Detection with Fully Convolutional Network

Arxiv

9+阅读 · 2021年3月26日

REST: Relational Event-driven Stock Trend Forecasting

Arxiv

15+阅读 · 2021年2月19日

Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data

Arxiv

9+阅读 · 2021年2月8日

Scattering GCN: Overcoming Oversmoothness in Graph Convolutional Networks

Arxiv

3+阅读 · 2020年3月18日

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Arxiv

14+阅读 · 2019年8月8日

Bayesian Convolutional Neural Networks

Arxiv

19+阅读 · 2018年6月27日

微信扫码咨询专知VIP会员