Persistent memory (PM) technologies enable programs to recover to a consistent state after a failure. To ensure this crash-consistent behavior, programs must enforce persist ordering through mechanisms such as logging and checkpointing, which introduce additional data movement. Emerging near-data processing (NDP) architectures can reduce this data movement overhead by partitioning persistent programs and executing the crash consistency mechanisms in NDP-enabled PM. However, maintaining persist ordering is a significant challenge once execution is partitioned between the host CPU and NDP-enabled PM. In this work, we first propose Partitioned Persist Ordering (PPO), which ensures correct persist ordering between the CPU and NDP devices, as well as among multiple NDP devices. PPO achieves high efficiency by reducing unnecessary synchronization between the CPU and NDP devices. Based on PPO, we prototype an NDP system, NearPM, on an FPGA platform. NearPM executes the data-intensive operations in crash consistency mechanisms with correct ordering guarantees while the rest of the program runs on the CPU. We evaluate nine PM workloads, each supporting three crash consistency mechanisms: logging, checkpointing, and shadow paging. Overall, NearPM achieves a 4.3-9.8X speedup in the NDP-offloaded operations and a 1.22-1.35X speedup in end-to-end execution.