This paper investigates the architectural features and performance potential of the Apple Silicon M-Series SoCs (M1, M2, M3, and M4) for HPC. We provide a detailed review of the CPU and GPU designs, the unified memory architecture, and coprocessors such as Advanced Matrix Extensions (AMX). We design and develop benchmarks in the Metal Shading Language and Objective-C++ to assess computational and memory performance. We also measure power consumption and efficiency using Apple's powermetrics tool. Our results show that M-Series chips offer relatively high memory bandwidth and significant improvements in computational performance, with the GPU outperforming the CPU from the M2 onward and peaking at 2.9 FP32 TFLOPS on the M4. Power consumption ranges from a few watts to 10-20 W, and all four chips exceed 200 GFLOPS per watt on the GPU and accelerator. Despite limited FP64 support on the GPU, the M-Series chips demonstrate strong potential for energy-efficient HPC applications. Our analysis examines whether the M-Series chips provide a competitive alternative to traditional HPC architectures or instead represent a distinct category altogether, which would make any such comparison an apples-to-oranges one.