Both the volume and the collection velocity of time series generated by monitoring sensors are increasing in the Internet of Things (IoT). Data management and analysis require IoT data of high quality and applicability. However, errors are prevalent in raw time series data. Inconsistency in time series is a serious data quality problem that is widespread in IoT, and existing techniques can hardly solve it. Motivated by this, we define the problem of inconsistent subsequences in multivariate time series and propose an integrity data repair approach to solve inconsistency problems. Our proposed repair method consists of two parts: (1) we design an effective anomaly detection method to discover latent inconsistent subsequences in IoT time series; and (2) we develop repair algorithms that precisely locate the start and end times of inconsistent intervals and provide reliable repair strategies. A thorough experiment on two real-life datasets verifies the superiority of our method over other practical approaches. Experimental results also show that our method effectively captures and repairs inconsistency problems in industrial time series in complex industrial IoT (IIoT) scenarios.
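The two-part structure described in the abstract (detect inconsistent subsequences, then locate their start and end and repair them) can be illustrated with a toy sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes a hypothetical consistency rule between two sensors, `b ≈ ratio * a`, flags points whose residual is a robust outlier (median and median absolute deviation), groups consecutive flagged points into intervals, and repairs them by substituting the value the rule predicts. The function names, the rule, the threshold, and the data are all illustrative assumptions.

```python
from statistics import median


def find_inconsistent_intervals(a, b, ratio, threshold=5.0):
    """Return inclusive (start, end) index pairs where b deviates from ratio * a.

    Assumption: the two series should satisfy b ~= ratio * a (hypothetical rule).
    """
    residuals = [y - ratio * x for x, y in zip(a, b)]
    center = median(residuals)
    # Median absolute deviation as a robust spread estimate; the floor avoids
    # division by zero when more than half of the residuals are identical.
    mad = median([abs(r - center) for r in residuals]) or 1e-9
    flagged = [abs(r - center) / mad > threshold for r in residuals]

    # Group consecutive flagged points into intervals.
    intervals, start = [], None
    for i, bad in enumerate(flagged):
        if bad and start is None:
            start = i
        elif not bad and start is not None:
            intervals.append((start, i - 1))
            start = None
    if start is not None:
        intervals.append((start, len(flagged) - 1))
    return intervals


def repair(a, b, ratio, intervals):
    """Replace b inside each interval with the value predicted from a."""
    repaired = list(b)
    for start, end in intervals:
        for i in range(start, end + 1):
            repaired[i] = ratio * a[i]
    return repaired


# Made-up readings: sensor b should be about twice sensor a; indices 3-4 are broken.
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [2.1, 3.9, 6.1, 20.0, 22.0, 11.9, 14.1, 16.1]

intervals = find_inconsistent_intervals(a, b, ratio=2.0)
repaired = repair(a, b, 2.0, intervals)
print(intervals)
print(repaired)
```

A median-based score is used here only because a mean and standard deviation would be distorted by the very subsequence being searched for. A real multivariate setting would need a learned or domain-specific consistency model in place of the fixed ratio.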