Devices and sensors generate streams of data across a wide range of locations and protocols. That data usually reaches a central platform that stores and processes the streams. Processing can be done in real time, with transformations and enrichment happening on the fly, but it can also happen after the data has been stored and organized in repositories. In the former case, stream processing technologies are required to operate on the data; in the latter, batch analytics and queries are commonly used. This paper introduces a runtime that dynamically constructs data stream processing topologies from user-supplied code. These dynamic topologies are built on the fly using a data subscription model defined by the applications that consume the data. Each user-defined processing unit is called a Service Object. Every Service Object consumes input data streams and may produce output streams that others can consume. The subscription-based programming model enables multiple users to deploy their own data-processing services. The runtime dynamically forwards data and executes Service Objects from different users. Data streams can originate in real-world devices, or they can be the outputs of Service Objects.
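To make the subscription model concrete, the following is a minimal sketch and not the runtime described in the paper. It assumes a single-process, in-memory setting, and every name in it (`ServiceObject`, `Runtime`, `deploy`, `publish`, and the stream identifiers) is hypothetical and chosen only for illustration. It shows how a topology can emerge purely from subscriptions: each Service Object declares the streams it consumes, and the runtime forwards every published value to the subscribers of that stream, feeding any outputs back in as new stream data.

```python
from collections import defaultdict, deque

# All names here are hypothetical; they are not taken from the paper.


class ServiceObject:
    """A user-defined processing unit that consumes streams and may emit one."""

    def __init__(self, name, inputs, fn, output=None):
        self.name = name
        self.inputs = list(inputs)  # stream ids this object subscribes to
        self.fn = fn                # user-supplied code: value -> result or None
        self.output = output        # stream id for results, or None for a sink


class Runtime:
    """Forwards data to subscribers; the topology is implied by subscriptions."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # stream id -> Service Objects

    def deploy(self, service_object):
        # Deploying registers the object's subscriptions; no explicit wiring.
        for stream in service_object.inputs:
            self.subscribers[stream].append(service_object)

    def publish(self, stream, value):
        # Breadth-first forwarding. Assumption: subscriptions form no cycle,
        # otherwise this loop would not terminate.
        pending = deque([(stream, value)])
        while pending:
            current, item = pending.popleft()
            for so in self.subscribers[current]:
                result = so.fn(item)
                if result is not None and so.output is not None:
                    pending.append((so.output, result))


runtime = Runtime()
alerts = []

# One user converts a device stream; another subscribes to the derived stream.
runtime.deploy(ServiceObject(
    "to_fahrenheit", ["sensor/temp_c"],
    lambda c: c * 9 / 5 + 32, output="temp_f"))
runtime.deploy(ServiceObject(
    "high_temp_alert", ["temp_f"],
    lambda f: alerts.append(f) if f > 100 else None))

runtime.publish("sensor/temp_c", 20.0)  # 68.0 F, below the alert threshold
runtime.publish("sensor/temp_c", 40.0)  # 104.0 F, recorded as an alert
print(alerts)  # [104.0]
```

In this sketch the second Service Object never refers to the first; it only names the stream `temp_f`. That decoupling is what would let independent users compose their services, although a real multi-user runtime would also need isolation, scheduling, and distribution, none of which are modeled here.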