BigDataBench: A Dwarf-based Big Data and AI Benchmark Suite

As the architecture, system, data management, and machine learning communities pay greater attention to innovative big data and data-driven artificial intelligence (AI) algorithms, architectures, and systems, the pressure to benchmark them rises. However, the complexity, diversity, frequently changing workloads, and rapid evolution of big data systems, and especially AI systems, raise great challenges in benchmarking. First, for the sake of conciseness, benchmarking scalability, portability cost, reproducibility, and better interpretation of performance data, we need to understand the abstractions of frequently appearing units of computation among big data and AI workloads, which we call dwarfs. Second, for the sake of fairness, the benchmarks must include a diversity of data and workloads. Third, for the co-design of software and hardware, the benchmarks should be consistent across different communities. Rather than creating a new benchmark or proxy for every possible workload, we propose using dwarf-based benchmarks (combinations of eight dwarfs) to represent the diversity of big data and AI workloads. The current version, BigDataBench 4.0, provides 13 representative real-world data sets and 47 big data and AI benchmarks, covering seven workload types: online service, offline analytics, graph analytics, AI, data warehouse, NoSQL, and streaming. BigDataBench 4.0 is publicly available from http://prof.ict.ac.cn/BigDataBench. Also, for the first time, we comprehensively characterize the benchmarks of the seven workload types in BigDataBench 4.0, in addition to traditional benchmarks like SPECCPU, PARSEC, and HPCC, in a hierarchical manner, drilling down on five levels using Top-Down analysis from an architecture perspective.
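To make the idea of a dwarf-based benchmark concrete, the following is a minimal sketch of how a workload might be described as a weighted combination of dwarfs. It is an illustration under stated assumptions, not the BigDataBench implementation: the abstract says there are eight dwarfs but does not name them, so the labels in `DWARFS`, the `DwarfMix` class, the `dominant_dwarfs` helper, and all weights are hypothetical placeholders.

```python
from dataclasses import dataclass

# Hypothetical labels: the abstract states that there are eight dwarfs
# but does not list them, so these names are placeholders only.
DWARFS = ("matrix", "sampling", "logic", "transform",
          "set", "graph", "sort", "statistic")


@dataclass(frozen=True)
class DwarfMix:
    """A workload described as non-negative weights over the dwarfs."""
    name: str
    weights: dict

    def __post_init__(self):
        unknown = set(self.weights) - set(DWARFS)
        if unknown:
            raise ValueError(f"unknown dwarfs: {sorted(unknown)}")
        if any(w < 0 for w in self.weights.values()):
            raise ValueError("weights must be non-negative")
        if sum(self.weights.values()) <= 0:
            raise ValueError("at least one weight must be positive")

    def normalized(self):
        """Return the share of each dwarf, summing to 1.0."""
        total = sum(self.weights.values())
        return {d: self.weights.get(d, 0.0) / total for d in DWARFS}


def dominant_dwarfs(mix, threshold=0.25):
    """List dwarfs whose share is at least `threshold`, largest first."""
    shares = mix.normalized()
    picked = [d for d, s in shares.items() if s >= threshold]
    return sorted(picked, key=lambda d: -shares[d])


# Illustrative weights only; these are not measurements from the paper.
example = DwarfMix("graph-analytics-like", {"matrix": 3, "graph": 4, "sort": 1})
print(example.normalized())
print(dominant_dwarfs(example))
```

Under this reading, two workloads with similar normalized shares could plausibly be covered by one dwarf-based benchmark, so a separate benchmark would not be needed for each.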
