Efficiently querying data on embedded sensor and IoT devices is challenging because memory and CPU resources are very limited. As the volume of collected data grows, it is critical to process, filter, and manipulate data on the edge devices where it is collected, both to improve efficiency and to reduce network transmissions. Existing embedded index structures do not adapt to the data distribution and characteristics. This paper demonstrates how learned indexes, which build space-efficient summaries of the data, can dramatically improve query performance and predictability. Learned indexes based on linear approximations can reduce query I/O by 50% to 90% and improve query throughput by a factor of 2 to 5 while requiring only a few kilobytes of RAM. Experimental results on a variety of time series data sets show that learned indexes improve considerably over state-of-the-art index algorithms.
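
To make the idea of a learned index based on linear approximations concrete, the following is a minimal sketch and not the paper's implementation. It assumes the keys are sorted, distinct, non-empty timestamps held in a Python list, and it uses a greedy piecewise-linear fit with a hypothetical error bound `eps`; the names `build_segments` and `lookup` are illustrative only. Each segment stores just three numbers (first key, first position, slope), which is why such a summary can plausibly fit in a small amount of RAM.

```python
from bisect import bisect_left, bisect_right

INF = float("inf")

def build_segments(keys, eps):
    """Greedily cover sorted, distinct keys with line segments that predict
    each key's position to within +/- eps slots (assumed error bound)."""
    segments = []                  # each entry: (first_key, first_pos, slope)
    k0, p0 = keys[0], 0            # origin of the currently open segment
    lo, hi = 0.0, INF              # feasible slope range for that segment
    for pos in range(1, len(keys)):
        dx = keys[pos] - k0
        new_lo = max(lo, (pos - p0 - eps) / dx)
        new_hi = min(hi, (pos - p0 + eps) / dx)
        if new_lo > new_hi:
            # No single slope fits this point too: close the segment here
            # and start a new one at the current key.
            segments.append((k0, p0, (lo + hi) / 2))
            k0, p0 = keys[pos], pos
            lo, hi = 0.0, INF
        else:
            lo, hi = new_lo, new_hi
    # A final segment holding a single key has no slope constraint; use 0.
    segments.append((k0, p0, (lo + hi) / 2 if hi != INF else 0.0))
    return segments

def lookup(keys, segments, starts, eps, key):
    """Return the position of key in keys, or None if it is absent."""
    s = bisect_right(starts, key) - 1      # segment whose range covers key
    if s < 0:
        return None                        # key is smaller than every stored key
    k0, p0, slope = segments[s]
    guess = round(p0 + slope * (key - k0))
    # Search only a small window around the prediction; one extra slot on
    # each side absorbs floating-point rounding.
    lo = min(max(guess - eps - 1, 0), len(keys))
    hi = min(max(guess + eps + 2, lo), len(keys))
    i = bisect_left(keys, key, lo, hi)
    return i if i < len(keys) and keys[i] == key else None

# Synthetic timestamps whose sampling rate changes partway through.
keys = list(range(0, 5000, 10)) + list(range(5000, 505000, 1000))
eps = 4
segments = build_segments(keys, eps)
starts = [seg[0] for seg in segments]
position = lookup(keys, segments, starts, eps, 4990)
```

The point of the sketch is that a lookup touches only a window of about `2 * eps` slots around the predicted position rather than walking a tree, which is the kind of saving in I/O that the abstract describes. On a real device the window would correspond to a few storage pages, a detail this sketch does not model.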