This article investigates the origin and evolution of the term "benchmark". It summarizes five categories of benchmarks that exist widely across disciplines: measurement standards, standardized data sets with defined properties, representative workloads, representative data sets, and best practices. I believe there are two pressing challenges in growing this discipline: establishing consistent benchmarking across disciplines and developing meta-benchmarks to measure the benchmarks themselves. I propose establishing benchmark science and engineering; one of its primary goals is to set up a standard benchmark hierarchy across disciplines. It is the right time to launch a multi-disciplinary benchmark, standard, and evaluation journal, TBench, to communicate the state of the art and state of the practice of benchmark science and engineering.