Edge-computing-enabled smart greenhouses are a representative application of Internet of Things technology: they monitor environmental information in real time and use it to support intelligent decision-making. In this process, anomaly detection on wireless sensor data plays an important role. However, traditional anomaly detection algorithms, originally designed for static data, do not properly account for the inherent characteristics of the data streams produced by wireless sensors, such as infiniteness, correlations, and concept drift. These characteristics pose a considerable challenge to stream-based anomaly detection and lead to low detection accuracy and efficiency. First, a data stream is usually generated quickly, which means that it is infinite and enormous, so any traditional offline anomaly detection algorithm that attempts to store the whole dataset or to scan it multiple times will run out of memory. Second, correlations exist among different data streams, which traditional algorithms hardly consider. Third, the underlying data generation process or data distribution may change over time, so traditional anomaly detection algorithms with no model update lose their effectiveness. To address these issues, this paper proposes a novel method, called DLSHiForest, based on Locality-Sensitive Hashing and a time window technique, which targets accurate and efficient detection. Comprehensive experiments on a real-world agricultural greenhouse dataset demonstrate the feasibility of our approach. The experimental results show that our proposal is practicable in addressing the challenges of traditional anomaly detection while ensuring accuracy and efficiency.
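To make the combination of hashing and a time window concrete, the following is a minimal illustrative sketch in Python. It is not the DLSHiForest algorithm, whose details are not given here. It assumes a simplified scheme in which each multi-sensor reading is hashed by several random projections and is scored by how few readings in the current window share its buckets. The class name `WindowedLSHDetector` and the parameters `window`, `n_tables`, and `width` are hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter, deque


class WindowedLSHDetector:
    """Toy streaming detector (illustrative only, NOT the paper's DLSHiForest)."""

    def __init__(self, dim, window=200, n_tables=10, width=1.0, seed=0):
        rng = random.Random(seed)
        self.window = window
        self.width = width
        # One random projection (a, b) per table: h(x) = floor((a.x + b) / width)
        self.projections = [
            ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
            for _ in range(n_tables)
        ]
        self.counts = [Counter() for _ in range(n_tables)]
        self.recent = deque()  # bucket keys of the readings still in the window

    def _keys(self, x):
        return [
            math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / self.width)
            for a, b in self.projections
        ]

    def score(self, x):
        """Value in [0, 1]; higher means fewer windowed readings share x's buckets."""
        if not self.recent:
            return 0.0
        keys = self._keys(x)
        total = sum(c[k] for c, k in zip(self.counts, keys))
        return 1.0 - total / (len(self.recent) * len(keys))

    def update(self, x):
        """Add a reading and evict the oldest one, so memory stays bounded."""
        keys = self._keys(x)
        self.recent.append(keys)
        for c, k in zip(self.counts, keys):
            c[k] += 1
        if len(self.recent) > self.window:
            for c, k in zip(self.counts, self.recent.popleft()):
                c[k] -= 1
                if c[k] == 0:
                    del c[k]


# Usage on made-up (temperature, humidity) readings; the values are synthetic.
stream_rng = random.Random(1)
detector = WindowedLSHDetector(dim=2)
for _ in range(300):
    detector.update((stream_rng.gauss(25.0, 0.3), stream_rng.gauss(60.0, 0.3)))
print("typical reading:", detector.score((25.0, 60.0)))
print("unusual reading:", detector.score((40.0, 10.0)))
```

In this sketch the window addresses the first and third issues in a simple way: only the most recent `window` readings are kept, and evicting old readings lets the bucket counts follow a changing distribution without an explicit retraining step. Hashing the readings of several sensors as one vector is one simple way to treat them jointly rather than per stream.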