We live in a data-centric world that is on track to generate close to 200 zettabytes of data by 2025. Data processing requirements have grown in step, as we push to build frameworks that can process large volumes of data within a few milliseconds or even microseconds. In prevalent computer system designs, data is stored passively in storage devices, brought in for processing, and the results are then written back out. As the volume of data explodes, this constant data movement has led to a "data movement wall" that hinders further progress and optimization in the design of data processing systems. One promising alternative is to push computation to the data (instead of the other way around) and design a computational storage device, or CSD. The idea of the CSD is not new and can trace its roots to pioneering work done in the 1970s and 1990s. More recently, with the emergence of non-volatile memory (NVM) storage in mainstream computing (e.g., NAND flash and Optane), the idea has regained traction, and multiple academic and commercial prototypes are now available. In this brief survey, we present a systematic analysis of work done in the area of computational storage and outline future directions.