This paper summarizes our work on experimental characterization and analysis of reduced-voltage operation in modern DRAM chips, which was published in SIGMETRICS 2017, and examines the work's significance and future potential. We take a comprehensive approach to understanding and exploiting the latency and reliability characteristics of modern DRAM when the DRAM supply voltage is lowered below the nominal voltage level specified by DRAM standards. We perform an experimental study of 124 real DDR3L (low-voltage) DRAM chips manufactured recently by three major DRAM vendors. We find that reducing the supply voltage below a certain point introduces bit errors in the data, and we comprehensively characterize the behavior of these errors. We discover that these errors can be avoided by increasing the latency of three major DRAM operations (activation, restoration, and precharge). We perform detailed DRAM circuit simulations to validate and explain our experimental findings. We also characterize the various relationships between reduced supply voltage and error locations, stored data patterns, DRAM temperature, and data retention. Based on our observations, we propose a new DRAM energy reduction mechanism, called Voltron. The key idea of Voltron is to use a performance model to determine by how much we can reduce the supply voltage without introducing errors and without exceeding a user-specified threshold for performance loss. Our evaluations show that Voltron reduces the average DRAM and system energy consumption by 10.5% and 7.3%, respectively, while limiting the average system performance loss to only 1.8%, for a variety of memory-intensive quad-core workloads. We also show that Voltron significantly outperforms prior dynamic voltage and frequency scaling mechanisms for DRAM.
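The voltage-selection idea behind Voltron can be illustrated with a minimal sketch. It assumes a caller-supplied performance model that maps a candidate supply voltage to a predicted performance loss, with the extra latency needed to avoid errors at that voltage already folded into the prediction. The function name `select_voltage`, the candidate voltage list, and the linear `toy_loss_model` are hypothetical placeholders for illustration; they are not the model or the parameters used in the paper.

```python
from typing import Callable, Sequence

NOMINAL_MV = 1350  # DDR3L nominal supply voltage (1.35 V), in millivolts


def select_voltage(
    candidates_mv: Sequence[int],
    predict_loss_pct: Callable[[int], float],
    max_loss_pct: float,
    nominal_mv: int = NOMINAL_MV,
) -> int:
    """Return the lowest candidate voltage whose predicted performance
    loss stays within the user-specified threshold.

    Assumes predicted loss does not decrease as voltage drops, so the
    search stops at the first candidate that exceeds the threshold.
    """
    best = nominal_mv
    # Walk from the highest candidate down to the lowest.
    for v in sorted(candidates_mv, reverse=True):
        if v > nominal_mv:
            continue  # never raise the voltage above nominal
        if predict_loss_pct(v) <= max_loss_pct:
            best = v
        else:
            break
    return best


def toy_loss_model(v_mv: int) -> float:
    """Hypothetical linear model: 1% predicted loss per 25 mV of reduction."""
    return (NOMINAL_MV - v_mv) / 25


candidates = [1350, 1300, 1250, 1200, 1150]
chosen = select_voltage(candidates, toy_loss_model, max_loss_pct=5.0)
print(chosen)
```

In this sketch, tightening `max_loss_pct` pushes the selection back toward the nominal voltage, which mirrors the trade-off between energy savings and the performance-loss threshold described above.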