Please, Don't Forget the Difference and the Confidence Interval when Seeking for the State-of-the-Art Status


May 23, 2022


From arXiv; accepted at LREC 2022.

This paper argues for the widest possible use of bootstrap confidence intervals for comparing NLP system performances instead of the state-of-the-art status (SOTA) and statistical significance testing. Their main benefits are to draw attention to the difference in performance between two systems and to help assess the degree of superiority of one system over another. Two case studies, one comparing several systems and the other based on a K-fold cross-validation procedure, illustrate these benefits. A Python module for obtaining these confidence intervals, together with a second function implementing the Fisher-Pitman test for paired samples, is freely available on PyPI.
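To make the two techniques named in the abstract concrete, here is a minimal sketch of a paired bootstrap confidence interval for the difference between two systems, followed by a Monte Carlo version of the Fisher-Pitman permutation test for paired samples. This is not the authors' PyPI module, whose name and API the abstract does not give. The function names `paired_bootstrap_ci` and `fisher_pitman_paired`, the choice of the percentile method, the default resample counts, and the synthetic scores are all assumptions made for illustration.

```python
import numpy as np


def paired_bootstrap_ci(scores_a, scores_b, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean per-item difference (A - B).

    Hypothetical helper: both systems must be scored on the same test items.
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    if a.shape != b.shape or a.ndim != 1:
        raise ValueError("scores must be 1-D and paired item by item")
    diffs = a - b
    rng = np.random.default_rng(seed)
    # Resample test items with replacement; each pair stays together
    # because we resample the per-item differences.
    idx = rng.integers(0, diffs.size, size=(n_resamples, diffs.size))
    boot_means = diffs[idx].mean(axis=1)
    low, high = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return diffs.mean(), low, high


def fisher_pitman_paired(scores_a, scores_b, n_permutations=10_000, seed=0):
    """Two-sided Monte Carlo Fisher-Pitman permutation test for paired samples."""
    diffs = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    rng = np.random.default_rng(seed)
    observed = abs(diffs.sum())
    # Under the null hypothesis, swapping the two systems on any item is
    # equally likely, which amounts to flipping the sign of that difference.
    signs = rng.choice([-1.0, 1.0], size=(n_permutations, diffs.size))
    permuted = np.abs((signs * diffs).sum(axis=1))
    # Small tolerance guards against floating-point ties; the add-one
    # correction keeps the estimated p-value strictly positive.
    extreme = np.sum(permuted >= observed - 1e-12)
    return (extreme + 1) / (n_permutations + 1)


# Usage on synthetic per-item correctness (1 = correct, 0 = wrong).
# These numbers are made up and are not results from the paper.
rng = np.random.default_rng(42)
system_a = rng.binomial(1, 0.80, size=500)
system_b = rng.binomial(1, 0.75, size=500)

mean_diff, low, high = paired_bootstrap_ci(system_a, system_b)
p_value = fisher_pitman_paired(system_a, system_b)
print(f"difference = {mean_diff:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
print(f"Fisher-Pitman p-value = {p_value:.4f}")
```

Reporting the interval alongside the point difference shows how large the gap between the two systems plausibly is, which a bare p-value or a SOTA claim does not. Both functions build an array of size `n_resamples` by the number of test items, so for very large test sets a loop over resamples would use less memory.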

