There are two major sources of inefficiency in computing systems that use modern DRAM devices as main memory. First, due to coarse-grained data transfers (the size of a cache block, usually 64B) between the DRAM and the memory controller, systems waste energy transferring data that is never used. Second, due to coarse-grained DRAM row activation, systems waste energy activating DRAM cells that go unused in the many workloads whose spatial locality is lower than the large row size (usually 8-16KB). We propose Sectored DRAM, a new, low-overhead DRAM substrate that alleviates both inefficiencies by enabling fine-grained DRAM access and activation. To efficiently retrieve only the useful data from DRAM, Sectored DRAM exploits the observation that, in many workloads, cache blocks are not fully utilized due to poor spatial locality. Sectored DRAM predicts the words in a cache block that will likely be accessed during the cache block's cache residency and: (i) transfers only the predicted words on the memory channel, as opposed to transferring the entire cache block, by dynamically tailoring the DRAM data transfer size to the workload, and (ii) activates a smaller set of cells that contain the predicted words, as opposed to activating the entire DRAM row, by carefully operating physically isolated portions of DRAM rows (MATs). Compared to prior work on fine-grained DRAM, Sectored DRAM greatly reduces DRAM energy consumption, does not reduce DRAM throughput, and can be implemented at low hardware cost. We evaluate Sectored DRAM using 41 workloads from widely used benchmark suites. Sectored DRAM reduces the DRAM energy consumption of highly memory-intensive workloads by up to 33% (20% on average) while improving their performance by 17% on average. Sectored DRAM's DRAM energy savings, combined with its system performance improvement, allow system-wide energy savings of up to 23%.
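The arithmetic behind transferring only the predicted words can be sketched as follows. This is a minimal illustration, not the Sectored DRAM design itself: the 8-byte word size, the one-bit-per-word mask, and the helper names `words_to_mask`, `transfer_bytes`, and `fraction_saved` are all assumptions made for the example, and the prediction is simply supplied by hand rather than produced by a predictor.

```python
BLOCK_BYTES = 64   # cache block size cited in the text
WORD_BYTES = 8     # assumption: 8-byte words, so 8 words per block
WORDS_PER_BLOCK = BLOCK_BYTES // WORD_BYTES


def words_to_mask(word_indices):
    """Build a bitmask with one bit per word predicted to be accessed."""
    mask = 0
    for i in word_indices:
        if not 0 <= i < WORDS_PER_BLOCK:
            raise ValueError(f"word index {i} out of range")
        mask |= 1 << i
    return mask


def transfer_bytes(mask):
    """Bytes moved on the channel if only the masked words are transferred."""
    return bin(mask).count("1") * WORD_BYTES


def fraction_saved(mask):
    """Fraction of the full-block transfer avoided by the partial transfer."""
    return 1 - transfer_bytes(mask) / BLOCK_BYTES


# Hypothetical prediction: words 0, 1, and 5 of the block will be used.
predicted = words_to_mask([0, 1, 5])
print(transfer_bytes(predicted))   # 24
print(fraction_saved(predicted))   # 0.625
```

The sketch counts only bytes on the memory channel. It does not model row activation energy, command overhead, or the cost of a misprediction, all of which would matter in a real evaluation.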