On the role of benchmarking data sets and simulations in method comparison studies

Method comparisons are essential for providing recommendations and guidance to applied researchers, who often have to choose from a plethora of available approaches. While many comparisons exist in the literature, these are often not neutral but favour a novel method. Apart from the choice of design and proper reporting of the findings, there are different approaches concerning the underlying data for such method comparison studies. Most manuscripts on statistical methodology rely on simulation studies and provide a single real-world data set as an example to motivate and illustrate the methodology investigated. In the context of supervised learning, in contrast, methods are often evaluated using so-called benchmarking data sets, i.e. real-world data that serve as a gold standard in the community; simulation studies are much less common in this context. The aim of this paper is to investigate differences and similarities between these approaches, to discuss their advantages and disadvantages, and ultimately to develop new approaches to the evaluation of methods that pick the best of both worlds. To this end, we borrow ideas from different contexts such as mixed methods research and Clinical Scenario Evaluation.
