The SPEC CPU2017 benchmark suite is an industry standard for accessing CPU performance. It adheres strictly to some workload and system configurations - arbitrary specificity - while leaving other system configurations undefined - arbitrary ambiguity. This article reveals: (1) Arbitrary specificity proves not meaningful, obscuring many scenarios, as evidenced by significant performance variations, a 74.49x performance difference observed on the same CPU. (2) Arbitrary ambiguity is unfair as it fails to establish the same configurations for comparing different CPUs. We propose an innovative CPU evaluation methodology. It considers all workload and system configurations valid and mandates each configuration to be well-defined to avoid arbitrary specificity and ambiguity. To reduce the evaluation cost, a sampling approach is proposed to select a subset of the configurations. To expose CPU performance under different scenarios, it treats all outcomes under each configuration as equally important. Finally, it utilizes confidence level and confidence interval to report the outcomes to avoid bias.
翻译:暂无翻译