With the rapid growth in the number of Internet of Things (IoT) devices, the volume and variety of real-world stream data are increasing quickly. Unfortunately, real-world stream data is unbounded and exhibits periodic volatility, which makes stream processing tasks inefficient. In this study, we report our recent efforts on this issue, with a focus on simulating stream data. First, we explore the characteristics of real-world IoT stream data to better understand it. Second, we propose a pipeline for simulating stream data that accurately and efficiently reproduces these characteristics in order to improve the efficiency of specific tasks. Finally, we design and implement a novel framework that can simulate various kinds of stream data for related stream processing tasks. To verify the validity of the proposed framework, we apply it to a stream processing task running in a stream processing system. The experimental results show that the task is accelerated by a factor of at least 24 when using the proposed simulation framework, while the volatility and trends of the stream data are preserved.