评价软件项目估计预测系统 (Evaluating prediction systems in software project estimation) - 专知论文

会员服务 ·

0

估计/估计量 · 模型评估 · 可约的 · Continuity · Performer ·

2021 年 1 月 14 日

Evaluating prediction systems in software project estimation

翻译：评价软件项目估计预测系统

Martin Shepperd,Stephen G. MacDonell

from arxiv, Journal, 10 pages, 3 figures, 6 tables

Context: Software engineering has a problem in that when we empirically evaluate competing prediction systems we obtain conflicting results. Objective: To reduce the inconsistency amongst validation study results and provide a more formal foundation to interpret results with a particular focus on continuous prediction systems. Method: A new framework is proposed for evaluating competing prediction systems based upon (1) an unbiased statistic, Standardised Accuracy, (2) testing the result likelihood relative to the baseline technique of random 'predictions', that is guessing, and (3) calculation of effect sizes. Results: Previously published empirical evaluations of prediction systems are re-examined and the original conclusions shown to be unsafe. Additionally, even the strongest results are shown to have no more than a medium effect size relative to random guessing. Conclusions: Biased accuracy statistics such as MMRE are deprecated. By contrast this new empirical validation framework leads to meaningful results. Such steps will assist in performing future meta-analyses and in providing more robust and usable recommendations to practitioners.

翻译：目标:减少验证研究结果之间的不一致,为解释结果提供一个更正式的基础,特别侧重于连续预测系统。方法:提出了一个新的框架,以评价竞争性预测系统,其依据是:(1) 无偏倚的统计,标准化的准确性,(2) 测试与随机`预测'基准技术(即猜测)相比的结果可能性,(3) 计算影响大小。结果:重新审查了以前公布的预测系统经验评价,并显示最初的结论不安全。此外,甚至最有力的结果也显示,与随机猜测相比,其影响大小不超过中等。结论:MMRE等误差准确性统计数据被淡化,相比之下,这一新的经验验证框架可以产生有意义的结果。这些步骤将有助于进行未来的元分析,并向从业人员提供更有力和有用的建议。

0

相关内容

估计/估计量

估计/估计量

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

54+阅读 · 2020年9月7日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【序列推荐系统:挑战、进展和展望】Sequential Recommender Systems

【序列推荐系统:挑战、进展和展望】Sequential Recommender Systems

专知会员服务

82+阅读 · 2020年4月25日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

253+阅读 · 2020年4月19日

【AAAI2020接受论文】预测性参与:开放领域对话系统自动评估的有效指标（Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems）

【AAAI2020接受论文】预测性参与:开放领域对话系统自动评估的有效指标（Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems）

专知会员服务

14+阅读 · 2019年11月15日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习相关资源(框架、库、软件)大列表

机器学习相关资源(框架、库、软件)大列表

专知会员服务

40+阅读 · 2019年10月9日

【RecSys 2019报告】推荐系统的意图，算法以及指标（Recommending for Impact:Intentions, Algorithms, and Metrics）

【RecSys 2019报告】推荐系统的意图，算法以及指标（Recommending for Impact:Intentions, Algorithms, and Metrics）

专知会员服务

36+阅读 · 2019年10月9日

计算机 | CCF推荐期刊专刊信息5条

计算机 | CCF推荐期刊专刊信息5条

Call4Papers

3+阅读 · 2019年4月10日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

人体姿态估计资源大列表（Human Pose Estimation）

人体姿态估计资源大列表（Human Pose Estimation）

专知

9+阅读 · 2018年10月6日

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

专知

19+阅读 · 2018年6月26日

LibRec 精选：推荐系统9个必备数据集

LibRec 精选：推荐系统9个必备数据集

LibRec智能推荐

6+阅读 · 2018年3月7日

【推荐】迁移学习：用0.046%的训练样本(6张图片)超过2013 Kaggle猫狗识别竞赛领先水平（附代码）

【推荐】迁移学习：用0.046%的训练样本(6张图片)超过2013 Kaggle猫狗识别竞赛领先水平（附代码）

机器学习研究会

5+阅读 · 2017年9月24日

【推荐】Python机器学习生态圈(Scikit-Learn相关项目)

【推荐】Python机器学习生态圈(Scikit-Learn相关项目)

机器学习研究会

6+阅读 · 2017年8月23日

User-centered Evaluation of Popularity Bias in Recommender Systems

Arxiv

0+阅读 · 2021年3月10日

Resolving Conflict in Decision-Making for Autonomous Driving

Resolving Conflict in Decision-Making for Autonomous Driving

Arxiv

0+阅读 · 2021年3月10日

Deep Learning-Based Human Pose Estimation: A Survey

Arxiv

27+阅读 · 2020年12月24日

FuxiCTR: An Open Benchmark for Click-Through Rate Prediction

Arxiv

8+阅读 · 2020年9月12日

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Arxiv

11+阅读 · 2019年11月4日

Challenges in Building Intelligent Open-domain Dialog Systems

Arxiv

8+阅读 · 2019年10月22日

Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools

Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools

Arxiv

3+阅读 · 2019年9月3日

Prediction of the FIFA World Cup 2018 - A random forest approach with an emphasis on estimated team ability parameters

Arxiv

3+阅读 · 2018年6月13日

(FPT-)Approximation Algorithms for the Virtual Network Embedding Problem

Arxiv

4+阅读 · 2018年3月12日

A Systematic Evaluation and Benchmark for Person Re-Identification: Features, Metrics, and Datasets

Arxiv

5+阅读 · 2018年2月14日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

54+阅读 · 2020年9月7日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【序列推荐系统:挑战、进展和展望】Sequential Recommender Systems

【序列推荐系统:挑战、进展和展望】Sequential Recommender Systems

专知会员服务

82+阅读 · 2020年4月25日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

253+阅读 · 2020年4月19日

【AAAI2020接受论文】预测性参与:开放领域对话系统自动评估的有效指标（Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems）

【AAAI2020接受论文】预测性参与:开放领域对话系统自动评估的有效指标（Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems）

专知会员服务

14+阅读 · 2019年11月15日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习相关资源(框架、库、软件)大列表

机器学习相关资源(框架、库、软件)大列表

专知会员服务

40+阅读 · 2019年10月9日

【RecSys 2019报告】推荐系统的意图，算法以及指标（Recommending for Impact:Intentions, Algorithms, and Metrics）

【RecSys 2019报告】推荐系统的意图，算法以及指标（Recommending for Impact:Intentions, Algorithms, and Metrics）

专知会员服务

36+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

数据智能体综述：新兴范式还是被高估的炒作？

海底战已至：美国构思海底安全战略 | 最新报告

【ICCV2025教程】视觉异常检测中的基础模型：进展、挑战与应用

美军将无人自主等新技术融入潜艇部队以更具杀伤力

相关资讯

计算机 | CCF推荐期刊专刊信息5条

计算机 | CCF推荐期刊专刊信息5条

Call4Papers

3+阅读 · 2019年4月10日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

人体姿态估计资源大列表（Human Pose Estimation）

人体姿态估计资源大列表（Human Pose Estimation）

专知

9+阅读 · 2018年10月6日

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

专知

19+阅读 · 2018年6月26日

LibRec 精选：推荐系统9个必备数据集

LibRec 精选：推荐系统9个必备数据集

LibRec智能推荐

6+阅读 · 2018年3月7日

【推荐】迁移学习：用0.046%的训练样本(6张图片)超过2013 Kaggle猫狗识别竞赛领先水平（附代码）

【推荐】迁移学习：用0.046%的训练样本(6张图片)超过2013 Kaggle猫狗识别竞赛领先水平（附代码）

机器学习研究会

5+阅读 · 2017年9月24日

【推荐】Python机器学习生态圈(Scikit-Learn相关项目)

【推荐】Python机器学习生态圈(Scikit-Learn相关项目)

机器学习研究会

6+阅读 · 2017年8月23日

相关论文

User-centered Evaluation of Popularity Bias in Recommender Systems

Arxiv

0+阅读 · 2021年3月10日

Resolving Conflict in Decision-Making for Autonomous Driving

Resolving Conflict in Decision-Making for Autonomous Driving

Arxiv

0+阅读 · 2021年3月10日

Deep Learning-Based Human Pose Estimation: A Survey

Arxiv

27+阅读 · 2020年12月24日

FuxiCTR: An Open Benchmark for Click-Through Rate Prediction

Arxiv

8+阅读 · 2020年9月12日

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Arxiv

11+阅读 · 2019年11月4日

Challenges in Building Intelligent Open-domain Dialog Systems

Arxiv

8+阅读 · 2019年10月22日

Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools

Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools

Arxiv

3+阅读 · 2019年9月3日

Prediction of the FIFA World Cup 2018 - A random forest approach with an emphasis on estimated team ability parameters

Arxiv

3+阅读 · 2018年6月13日

(FPT-)Approximation Algorithms for the Virtual Network Embedding Problem

Arxiv

4+阅读 · 2018年3月12日

A Systematic Evaluation and Benchmark for Person Re-Identification: Features, Metrics, and Datasets

Arxiv

5+阅读 · 2018年2月14日

微信扫码咨询专知VIP会员