With the growing popularity of the Internet of Things (IoT), edge computing, and cloud computing, more and more stream analytics applications are being developed on top of IoT sensing data, including real-time trend prediction and object detection. One popular type of stream analytics is time series or sequence prediction and forecasting based on recurrent neural network (RNN) deep learning models. Unlike traditional analytics, which assumes that data are available ahead of time and do not change, stream analytics deals with data that are generated continuously and whose trend or distribution may change over time (a phenomenon known as concept drift), causing prediction and forecasting accuracy to drop. Another challenge is finding the resource provisioning that gives stream analytics good overall latency. In this paper, we study how to best leverage edge and cloud resources to achieve better accuracy and latency for stream analytics using a type of RNN model called long short-term memory (LSTM). We propose a novel edge-cloud integrated framework for hybrid stream analytics that supports low-latency inference on the edge and high-capacity training in the cloud. To achieve flexible deployment, we study three approaches to deploying our hybrid learning framework: edge-centric, cloud-centric, and edge-cloud integrated. Further, our hybrid learning framework can dynamically combine inference results from an LSTM model pre-trained on historical data and another LSTM model re-trained periodically on the most recent data. Using real-world and simulated stream datasets, our experiments show that the proposed edge-cloud integrated deployment achieves the best latency of the three deployment types. For accuracy, the experiments show that our dynamic learning approach performs best among all learning approaches in all three concept drift scenarios.
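As an illustration of what dynamically combining the two models' inference results could look like, the following minimal Python sketch blends two forecasts with weights that track each model's recent error. The abstract does not specify the combination rule, so the inverse-error weighting, the sliding window, and every name here (`DynamicCombiner`, `pred_static`, `pred_recent`) are assumptions made for illustration rather than the method used in the paper. The two predictions stand in for the outputs of the pre-trained and the periodically re-trained LSTM models.

```python
from collections import deque


class DynamicCombiner:
    """Blend two forecasts, weighting each model by its recent accuracy.

    Hypothetical sketch: inverse mean-absolute-error weighting over a
    sliding window is an assumed rule, not the paper's stated method.
    """

    def __init__(self, window=20, eps=1e-9):
        # Recent absolute errors of each model; old entries fall out
        # of the window automatically.
        self.err_static = deque(maxlen=window)
        self.err_recent = deque(maxlen=window)
        self.eps = eps  # avoids division by zero for a perfect model

    def weights(self):
        # With no error history yet, trust both models equally.
        if not self.err_static or not self.err_recent:
            return 0.5, 0.5
        mae_static = sum(self.err_static) / len(self.err_static)
        mae_recent = sum(self.err_recent) / len(self.err_recent)
        inv_static = 1.0 / (mae_static + self.eps)
        inv_recent = 1.0 / (mae_recent + self.eps)
        total = inv_static + inv_recent
        return inv_static / total, inv_recent / total

    def combine(self, pred_static, pred_recent):
        # Weighted average of the two models' predictions.
        w_static, w_recent = self.weights()
        return w_static * pred_static + w_recent * pred_recent

    def update(self, pred_static, pred_recent, actual):
        # Call once the true value for a past prediction is observed.
        self.err_static.append(abs(pred_static - actual))
        self.err_recent.append(abs(pred_recent - actual))
```

Under this assumed rule, when concept drift makes the historically trained model less accurate, its windowed error grows and the combined output shifts toward the re-trained model; if the stream returns to its earlier distribution, the weights shift back.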