The front-end bottleneck in datacenter workloads has come under increased scrutiny, driven by growing code footprints, the involvement of numerous libraries and OS services, and unpredictability in the instruction stream. Our examination of these workloads points to burstiness in accesses to instruction blocks, a phenomenon also observed in data accesses. Such burstiness stems largely from spatial and short-duration temporal localities, which LRU fails to recognize and optimize for when a single cache caters to both forms of locality. Instead, we incorporate a small i-Filter, as in previous works, to separate spatial from temporal accesses. However, simple separation does not suffice: we additionally need to predict whether a block will continue to exhibit temporal locality after its burst of spatial locality. This combination of an i-Filter and a temporal locality predictor constitutes our Admission-Controlled Instruction Cache (ACIC). ACIC outperforms a number of state-of-the-art pollution-reduction techniques (replacement algorithms, bypassing mechanisms, victim caches), providing a 1.0223 average speedup over a baseline conventional LRU-based i-cache (bridging over half of the gap between LRU and OPT) across several datacenter workloads.
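To make the mechanism concrete, the following is a minimal behavioral sketch, not the authors' actual hardware design: a small fully-associative i-Filter absorbs bursty (spatial) accesses, and on filter eviction a simple reuse predictor decides whether the block is admitted into the LRU main cache or bypassed. All class and variable names, the filter/cache sizes, and the one-bit reuse-based predictor are illustrative assumptions.

```python
from collections import OrderedDict

class ACICSketch:
    """Hedged sketch of an admission-controlled i-cache.

    A small i-Filter catches bursts of spatial locality; only blocks
    predicted to retain temporal locality after the burst are admitted
    into the LRU main cache. The predictor here is a simple per-block
    reuse bit, an illustrative assumption, not the paper's design.
    """

    def __init__(self, filter_blocks=8, cache_blocks=64):
        self.ifilter = OrderedDict()   # block -> reused-while-in-filter flag
        self.cache = OrderedDict()     # LRU-ordered main i-cache
        self.predictor = {}            # block -> predicted temporal reuse
        self.filter_blocks = filter_blocks
        self.cache_blocks = cache_blocks

    def access(self, block):
        if block in self.ifilter:
            # Burst (spatial) reuse is served by the filter alone.
            self.ifilter.move_to_end(block)
            self.ifilter[block] = True
            return "filter-hit"
        if block in self.cache:
            # Temporal reuse in the main cache reinforces the prediction.
            self.cache.move_to_end(block)
            self.predictor[block] = True
            return "cache-hit"
        # Miss: fetch into the filter first, never directly into the cache.
        self.ifilter[block] = False
        if len(self.ifilter) > self.filter_blocks:
            victim, reused = self.ifilter.popitem(last=False)
            # Admission control: only blocks predicted to keep temporal
            # locality enter the main cache; the rest are bypassed.
            if self.predictor.get(victim, reused):
                self.cache[victim] = True
                if len(self.cache) > self.cache_blocks:
                    self.cache.popitem(last=False)  # LRU eviction
            else:
                self.predictor[victim] = reused     # train toward bypass
        return "miss"
```

In this sketch a block that is re-referenced while resident in the filter (a burst) is admitted on eviction, so a later access to it hits in the main cache instead of polluting it on first touch.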