机械学习研究的推论可复制性 (Towards Inferential Reproducibility of Machine Learning Research) - 专知论文

会员服务 ·

0

噪声 · Analysis · 方差 · Machine Learning · INTERACT ·

2023 年 2 月 8 日

Towards Inferential Reproducibility of Machine Learning Research

翻译：机械学习研究的推论可复制性

Michael Hagmann,Stefan Riezler

Reliability of machine learning evaluation -- the consistency of observed evaluation scores across replicated model training runs -- is affected by several sources of nondeterminism which can be regarded as measurement noise. Current tendencies to remove noise in order to enforce reproducibility of research results neglect inherent nondeterminism at the implementation level and disregard crucial interaction effects between algorithmic noise factors and data properties. This limits the scope of conclusions that can be drawn from such experiments. Instead of removing noise, we propose to incorporate several sources of variance, including their interaction with data properties, into an analysis of significance and reliability of machine learning evaluation, with the aim to draw inferences beyond particular instances of trained models. We show how to use linear mixed effects models (LMEMs) to analyze performance evaluation scores, and to conduct statistical inference with a generalized likelihood ratio test (GLRT). This allows us to incorporate arbitrary sources of noise like meta-parameter variations into statistical significance testing, and to assess performance differences conditional on data properties. Furthermore, a variance component analysis (VCA) enables the analysis of the contribution of noise sources to overall variance and the computation of a reliability coefficient by the ratio of substantial to total variance.

翻译：机器学习评价的可靠性 -- -- 在复制模式培训过程中观察到的评价分数的一致性 -- -- 受到若干不确定性来源的影响,这些来源可被视为测量噪音;目前消除噪音的趋势,以强制实施研究成果的可复制性;忽视执行一级固有的不确定性,忽视算法噪音因素和数据特性之间的关键互动效应;这限制了从这种实验中得出的结论的范围;我们提议在分析机器学习评价的重要性和可靠性时,纳入若干差异来源,包括它们与数据特性的相互作用,目的是在经过培训的模型的特定例子之外得出推论;我们展示如何使用线性混合效应模型分析业绩评价分数,并用普遍概率比值测试进行统计推论;这使我们能够将诸如元参数变化等任意的噪音来源纳入统计意义测试,并根据数据特性评估性差。此外,差异组成部分分析有助于分析噪音来源对总体差异的贡献,以及根据总体差异与总体差异的比例计算可靠性系数。

0

相关内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

AI实战圣经《Machine Learning Yearning》第1-52章中英文版pdf分享

AI实战圣经《Machine Learning Yearning》第1-52章中英文版pdf分享

深度学习与NLP

15+阅读 · 2018年9月8日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

MC-LTH1降解微囊藻毒素的机理及降解产物的肝细胞毒性研究

国家自然科学基金

0+阅读 · 2015年12月31日

Chemerin通过调节p38MAPK通路参与动脉粥样硬化分子机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于Morrey空间的函数空间实变理论及其应用

国家自然科学基金

0+阅读 · 2014年12月31日

Long non-coding RNA MEG3分子对胶质瘤干细胞调控作用的研究

国家自然科学基金

0+阅读 · 2013年12月31日

β2-AR/PKA通路在内皮祖细胞修复急性肾损伤中的作用及机制

国家自然科学基金

0+阅读 · 2013年12月31日

基于SURE/PURE准则的图像盲反卷积算法研究

国家自然科学基金

3+阅读 · 2013年12月31日

Schrodinger-Poisson方程的若干问题研究

国家自然科学基金

1+阅读 · 2012年12月31日

PGC-1α调节骨骼肌脂肪酸代谢和胰岛素抵抗的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

Cajal样间质细胞调控Oddi括约肌运动及在Oddi括约肌功能障碍发病中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

脂肪因子Chemerin在骨骼肌胰岛素抵抗发生中的作用及其机制

国家自然科学基金

0+阅读 · 2008年12月31日

Reproducibility is Nothing without Correctness: The Importance of Testing Code in NLP

Reproducibility is Nothing without Correctness: The Importance of Testing Code in NLP

Arxiv

0+阅读 · 2023年3月31日

Towards Outcome-Driven Patient Subgroups: A Machine Learning Analysis Across Six Depression Treatment Studies

Towards Outcome-Driven Patient Subgroups: A Machine Learning Analysis Across Six Depression Treatment Studies

Arxiv

0+阅读 · 2023年3月30日

One Step to Efficient Synthetic Data

Arxiv

0+阅读 · 2023年3月29日

Dynamical Modularity in Automata Models of Biochemical Networks

Arxiv

0+阅读 · 2023年3月29日

Hallucinations in Large Multilingual Translation Models

Arxiv

0+阅读 · 2023年3月28日

Towards Out-Of-Distribution Generalization: A Survey

Arxiv

38+阅读 · 2021年8月31日

Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond

Arxiv

15+阅读 · 2020年5月13日

A Survey of Adversarial Learning on Graphs

Arxiv

38+阅读 · 2020年3月10日

A Comprehensive Survey on Transfer Learning

A Comprehensive Survey on Transfer Learning

Arxiv

121+阅读 · 2019年11月7日

Interpretable machine learning: definitions, methods, and applications

Interpretable machine learning: definitions, methods, and applications

Arxiv

19+阅读 · 2019年1月14日

VIP会员

文章信息

相关主题

Machine Learning

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

新书册《几何深度学习的数学基础》

中程单向攻击无人机的战略意义：俄乌战争启示

在无标注条件下适配视觉—语言模型：全面综述

面向视觉语言模型的持续学习：遗忘之外的综述与分类体系

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

AI实战圣经《Machine Learning Yearning》第1-52章中英文版pdf分享

AI实战圣经《Machine Learning Yearning》第1-52章中英文版pdf分享

深度学习与NLP

15+阅读 · 2018年9月8日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

相关论文

Reproducibility is Nothing without Correctness: The Importance of Testing Code in NLP

Reproducibility is Nothing without Correctness: The Importance of Testing Code in NLP

Arxiv

0+阅读 · 2023年3月31日

Towards Outcome-Driven Patient Subgroups: A Machine Learning Analysis Across Six Depression Treatment Studies

Towards Outcome-Driven Patient Subgroups: A Machine Learning Analysis Across Six Depression Treatment Studies

Arxiv

0+阅读 · 2023年3月30日

One Step to Efficient Synthetic Data

Arxiv

0+阅读 · 2023年3月29日

Dynamical Modularity in Automata Models of Biochemical Networks

Arxiv

0+阅读 · 2023年3月29日

Hallucinations in Large Multilingual Translation Models

Arxiv

0+阅读 · 2023年3月28日

Towards Out-Of-Distribution Generalization: A Survey

Arxiv

38+阅读 · 2021年8月31日

Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond

Arxiv

15+阅读 · 2020年5月13日

A Survey of Adversarial Learning on Graphs

Arxiv

38+阅读 · 2020年3月10日

A Comprehensive Survey on Transfer Learning

A Comprehensive Survey on Transfer Learning

Arxiv

121+阅读 · 2019年11月7日

Interpretable machine learning: definitions, methods, and applications

Interpretable machine learning: definitions, methods, and applications

Arxiv

19+阅读 · 2019年1月14日

相关基金

MC-LTH1降解微囊藻毒素的机理及降解产物的肝细胞毒性研究

国家自然科学基金

0+阅读 · 2015年12月31日

Chemerin通过调节p38MAPK通路参与动脉粥样硬化分子机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于Morrey空间的函数空间实变理论及其应用

国家自然科学基金

0+阅读 · 2014年12月31日

Long non-coding RNA MEG3分子对胶质瘤干细胞调控作用的研究

国家自然科学基金

0+阅读 · 2013年12月31日

β2-AR/PKA通路在内皮祖细胞修复急性肾损伤中的作用及机制

国家自然科学基金

0+阅读 · 2013年12月31日

基于SURE/PURE准则的图像盲反卷积算法研究

国家自然科学基金

3+阅读 · 2013年12月31日

Schrodinger-Poisson方程的若干问题研究

国家自然科学基金

1+阅读 · 2012年12月31日

PGC-1α调节骨骼肌脂肪酸代谢和胰岛素抵抗的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

Cajal样间质细胞调控Oddi括约肌运动及在Oddi括约肌功能障碍发病中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

脂肪因子Chemerin在骨骼肌胰岛素抵抗发生中的作用及其机制

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员