User-facing applications running in modern datacenters exhibit irregular request patterns and are implemented using a multitude of services with tight latency requirements. These characteristics render existing idle-time energy-saving techniques ineffective, because the transition out of a deep idle power state (C-state) takes too long. While prior works propose management techniques to mitigate this inefficiency, we tackle it at its root with AgileWatts (AW): a new deep C-state architecture optimized for datacenter server processors targeting latency-sensitive applications. AW is based on three key ideas. First, AW eliminates the latency overhead of saving and restoring the core context (i.e., micro-architectural state) when powering the core off and on in a deep idle power state by i) implementing medium-grained power-gates, carefully distributed across the CPU core, and ii) retaining context in the power-ungated domain. Second, AW eliminates the latency overhead (several tens of microseconds) of flushing the L1/L2 caches when entering a deep idle power state by keeping the L1/L2 cache content power-ungated. Minimal control logic also remains power-ungated to serve cache coherence traffic (i.e., snoops) seamlessly. AW implements a sleep mode in the caches to reduce cache leakage power and lowers the core voltage to the minimum operational level to minimize the leakage power of the power-ungated domain. Third, using a state-of-the-art power-efficient all-digital phase-locked loop (ADPLL) clock generator, AW keeps the PLL active and locked during the idle state, further cutting precious microseconds of wake-up latency at a negligible power cost. Our evaluation with an accurate simulator calibrated against an Intel Skylake server shows that AW reduces the energy consumption of Memcached by up to 71% (35% on average) with up to 1% performance degradation.
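To illustrate why a long exit latency keeps cores out of deep C-states under tight latency requirements, the following is a minimal sketch of idle-state selection. It is not AW's mechanism or any real governor's code: the `IdleState` type, the `pick_state` function, and every number below are hypothetical placeholders chosen only for illustration, not values from the evaluation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdleState:
    name: str
    power_w: float               # power while resident in the state (assumed value)
    exit_latency_us: float       # time needed to resume execution (assumed value)
    target_residency_us: float   # minimum idle time for the state to pay off (assumed value)


def pick_state(states, predicted_idle_us, latency_limit_us):
    """Return the lowest-power state that fits both the predicted idle
    period and the tolerated wake-up latency, or None if none fits."""
    eligible = [
        s for s in states
        if s.target_residency_us <= predicted_idle_us
        and s.exit_latency_us <= latency_limit_us
    ]
    return min(eligible, key=lambda s: s.power_w, default=None)


# Hypothetical states; the numbers are illustrative, not measurements.
SHALLOW = IdleState("shallow", power_w=1.0, exit_latency_us=2.0, target_residency_us=2.0)
DEEP = IdleState("deep", power_w=0.1, exit_latency_us=100.0, target_residency_us=500.0)
AGILE = IdleState("agile-deep", power_w=0.2, exit_latency_us=2.0, target_residency_us=5.0)

# Short, irregular idle periods with a tight latency limit.
without_agile = pick_state([SHALLOW, DEEP], predicted_idle_us=50, latency_limit_us=10)
with_agile = pick_state([SHALLOW, DEEP, AGILE], predicted_idle_us=50, latency_limit_us=10)
```

Under these assumed numbers, the conventional deep state is never eligible for short idle periods, so the core stays in the shallow state; a low-power state with a short exit latency becomes selectable in the same situation.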