The emergence of the Internet of Things has introduced numerous connected devices that are used to monitor and control even critical infrastructures. Distributed stream processing has become key to analyzing the data these devices generate and to improving our ability to make decisions. However, optimizing these systems towards specific Quality of Service targets is a difficult and time-consuming task because of the large-scale distributed systems involved, the large number of configuration parameters, and the difficulty of determining the impact of tuning them. In this paper, we present an approach for the effective testing of system configurations for critical IoT analytics pipelines. We demonstrate our approach with a prototype called Timon, which is integrated with Kubernetes. This tool allows pipelines to be easily replicated in parallel and evaluated to determine the optimal configuration for specific applications. We demonstrate the usefulness of our approach by investigating different configurations of an exemplary geographically-based traffic monitoring application implemented in Apache Flink.