The overwhelming majority of High Performance Computing (HPC) systems and server infrastructure use Intel x86 processors. This makes an architectural analysis of these processors relevant for a wide audience of administrators and performance engineers. In this paper, we describe the effects of hardware-controlled energy efficiency features for the Intel Skylake-SP processor. Because Intel has prolonged its micro-architecture cycles beyond the previous Tick-Tock scheme, our findings will also be relevant for succeeding architectures. The findings of this paper include the following: C-state latencies have increased significantly compared with the Haswell-EP processor generation. The mechanism that controls the uncore frequency has a latency of approximately 10 ms, and it is not possible to truly fix the uncore frequency to a specific level. The out-of-order throttling for workloads using 512-bit-wide vectors also occurs at low processor frequencies. The data being processed has a significant impact on processor power consumption, which causes a large error in energy models that rely only on instructions.