Identifying anomalies in large multi-dimensional time series is a crucial and difficult task across multiple domains. Few methods in the literature address this task when some of the variables are categorical. We formalize an analogy between categorical time series and classical Natural Language Processing, and demonstrate its strength by implementing and testing three different machine learning models for anomaly detection and root cause investigation that are based upon it.