As more and more devices connect to the Internet of Things, they will generate unbounded streams of data that must be processed on the fly to trigger automated actions and deliver real-time services. Spark Streaming is a popular real-time stream processing framework. Using Spark Streaming efficiently and achieving stable stream processing requires a careful interplay between different parameter configurations. Misconfiguration may lead to significant resource overprovisioning and poor performance. To alleviate such issues, this paper develops an executable and configurable model named SSP (short for Spark Streaming Processing) to model and simulate Spark Streaming. SSP is written in ABS, a formal, executable, object-oriented language for modeling distributed systems by means of concurrent object groups. SSP allows users to rapidly evaluate and compare different parameter configurations without deploying their applications on a cluster or cloud. The simulation results show that SSP is able to mimic Spark Streaming in different scenarios.
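To illustrate why parameter configurations interact with stability, the following is a minimal toy sketch in Python. It is not the SSP model and not ABS code. It assumes that a micro-batch becomes ready at the end of each batch interval and that batches are processed strictly one at a time; the function name `simulate_scheduling_delay` and all numeric values are hypothetical and chosen only for illustration.

```python
def simulate_scheduling_delay(batch_interval, processing_times):
    """Return the scheduling delay of each batch in a one-at-a-time queue.

    Assumption: batch i becomes ready at (i + 1) * batch_interval and can
    start only once the previous batch has finished processing.
    """
    delays = []
    worker_free_at = 0.0  # time at which the previous batch finishes
    for i, processing_time in enumerate(processing_times):
        ready_at = (i + 1) * batch_interval
        start_at = max(ready_at, worker_free_at)
        delays.append(start_at - ready_at)
        worker_free_at = start_at + processing_time
    return delays


# Processing time below the batch interval: no batch ever waits.
stable = simulate_scheduling_delay(1.0, [0.8] * 5)

# Processing time above the batch interval: waiting time grows batch by batch.
unstable = simulate_scheduling_delay(1.0, [1.5] * 5)

print(stable)    # [0.0, 0.0, 0.0, 0.0, 0.0]
print(unstable)  # [0.0, 0.5, 1.0, 1.5, 2.0]
```

Under these assumptions, the delay stays at zero when each batch finishes within one interval and grows without bound otherwise. That is the kind of instability a simulation can reveal before an application is deployed.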