Method comparisons are essential for providing recommendations and guidance to applied researchers, who often have to choose from a plethora of available approaches. While many comparisons exist in the literature, they are often not neutral but favour a novel method. Apart from the choice of design and the proper reporting of findings, method comparison studies differ in the data on which they are based. Most manuscripts on statistical methodology rely on simulation studies and provide a single real-world data set as an example to motivate and illustrate the methodology investigated. In the context of supervised learning, in contrast, methods are often evaluated using so-called benchmarking data sets, i.e. real-world data that serve as a gold standard in the community; simulation studies are much less common in this context. The aim of this paper is to investigate the differences and similarities between these approaches, to discuss their advantages and disadvantages, and ultimately to develop new approaches to method evaluation that combine the best of both worlds. To this end, we borrow ideas from different contexts such as mixed methods research and Clinical Scenario Evaluation.