After the initial release of a machine learning algorithm, the model can be fine-tuned by retraining on subsequently gathered data, adding newly discovered features, or making other changes. Each modification introduces a risk of deteriorating performance and must be validated on a test dataset. It may not always be practical to assemble a new test dataset for each modification, especially when most modifications are minor or are implemented in rapid succession. Recent works have shown how one can repeatedly test modifications on the same dataset and protect against overfitting by (i) discretizing test results along a grid and (ii) applying a Bonferroni correction to adjust for the total number of modifications considered by an adaptive developer. However, the standard Bonferroni correction is overly conservative when most modifications are beneficial and/or highly correlated. This work investigates more powerful approaches using alpha-recycling and sequentially-rejective graphical procedures (SRGPs). We introduce novel extensions that account for correlation between adaptively chosen algorithmic modifications. In empirical analyses, the SRGPs control the error rate of approving unacceptable modifications and approve a substantially larger number of beneficial modifications than previous approaches.
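To make the contrast with Bonferroni concrete, the sketch below implements a generic sequentially-rejective graphical procedure in the sense of Bretz et al. (2009), the framework the abstract refers to: each candidate modification starts with a share of the total alpha budget, and when a modification is approved its alpha is recycled to the remaining candidates along a transition graph. This is a minimal illustration under assumed inputs, not the paper's own procedure; the function name `srgp_reject`, the chain-shaped graph in the usage example, and the fixed level of 0.05 are all illustrative choices.

```python
import numpy as np

def srgp_reject(p, alpha0, G, alpha=0.05):
    """Generic sequentially-rejective graphical procedure (Bretz et al., 2009).

    p      : p-values, one per candidate modification
    alpha0 : initial alpha weights (fractions of `alpha`), summing to <= 1
    G      : transition matrix; G[i, j] is the fraction of H_i's level
             passed to H_j when H_i is rejected (G[i, i] = 0, rows sum <= 1)
    Returns a boolean array marking approved (rejected-null) modifications.
    """
    p = np.asarray(p, dtype=float)
    a = np.asarray(alpha0, dtype=float) * alpha
    G = np.asarray(G, dtype=float).copy()
    active = np.ones(len(p), dtype=bool)
    rejected = np.zeros(len(p), dtype=bool)
    while True:
        # reject any active hypothesis whose p-value meets its current level
        candidates = [i for i in np.where(active)[0] if p[i] <= a[i]]
        if not candidates:
            return rejected
        j = candidates[0]
        rejected[j] = True
        active[j] = False
        # recycle H_j's alpha and update the transition weights (Bretz update)
        a_new, G_new = a.copy(), np.zeros_like(G)
        for l in np.where(active)[0]:
            a_new[l] = a[l] + a[j] * G[j, l]
            for k in np.where(active)[0]:
                if l == k:
                    continue
                denom = 1.0 - G[l, j] * G[j, l]
                if denom > 0:
                    G_new[l, k] = (G[l, k] + G[l, j] * G[j, k]) / denom
        a_new[j] = 0.0
        a, G = a_new, G_new

# Usage: two modifications tested in order, with all alpha flowing forward
# on approval (a simple chain graph).
p_vals = [0.01, 0.03]
alpha0 = [1.0, 0.0]            # spend the full budget on the first test
G = [[0.0, 1.0],
     [0.0, 0.0]]
print(srgp_reject(p_vals, alpha0, G, alpha=0.05))  # -> [ True  True]
```

Under a plain Bonferroni split of the same budget, each modification would be tested at 0.05 / 2 = 0.025, so the second modification (p = 0.03) would not be approved; recycling the first test's alpha is what recovers it.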