Time series data is ubiquitous, and outlier detection in time series is a fundamental data mining problem with numerous applications. Existing autoencoder-based approaches deliver state-of-the-art performance on challenging real-world data, but they are vulnerable to outliers and offer low explainability. To address these two limitations, we propose robust and explainable unsupervised autoencoder frameworks that decompose an input time series into a clean time series and an outlier time series. Explainability improves because the clean time series is better described by easy-to-understand patterns such as trends and periodicities. We provide insight into this by means of a post-hoc explainability analysis and empirical studies. In addition, because outliers are separated from the clean time series iteratively, our approach is more robust to outliers, which in turn improves accuracy. We evaluate our approach on five real-world datasets and report improvements over state-of-the-art approaches in terms of robustness and explainability. This is an extended version of "Robust and Explainable Autoencoders for Unsupervised Time Series Outlier Detection", to appear in IEEE ICDE 2022.
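The decomposition idea can be illustrated with a minimal sketch. This is not the frameworks proposed in the paper. It assumes a linear autoencoder (a truncated SVD over fixed-length windows) in place of a trained neural autoencoder, and a soft-thresholding step to collect large residuals into the outlier series, in the spirit of robust PCA. All function names, the window width, the rank, and the threshold `lam` are hypothetical choices made for illustration on synthetic data.

```python
import numpy as np

def low_rank_reconstruct(X, rank):
    """Linear stand-in for an autoencoder (assumption): encode each row of X
    onto the top `rank` principal directions, then decode back."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    V = Vt[:rank].T  # shared encoder/decoder weights, shape (width, rank)
    return (X - mean) @ V @ V.T + mean

def soft_threshold(R, lam):
    """Shrink residuals toward zero; entries with |r| <= lam become exactly 0."""
    return np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)

def decompose(series, width, rank=1, lam=1.0, n_iter=20):
    """Alternate between fitting the clean part and extracting the outlier part.
    Assumes len(series) is a multiple of `width` (non-overlapping windows)."""
    X = series.reshape(-1, width)
    S = np.zeros_like(X)
    for _ in range(n_iter):
        L = low_rank_reconstruct(X - S, rank)  # fit on data with outliers removed
        S = soft_threshold(X - L, lam)         # large residuals go to the outlier part
    return L.ravel(), S.ravel()

# Synthetic example: a periodic signal with a linear trend, light noise,
# and three injected spikes.
rng = np.random.default_rng(0)
t = np.arange(1000)
truth = np.sin(2 * np.pi * t / 25) + 0.01 * t
series = truth + 0.05 * rng.standard_normal(t.size)
injected = [100, 457, 803]
series[injected] += [5.0, -5.0, 5.0]

clean, outliers = decompose(series, width=25)
flagged = np.flatnonzero(outliers).tolist()  # time steps with a nonzero outlier value
print(flagged)
```

Because the reconstruction is refit on the series with the current outlier estimate subtracted, the spikes have less influence on the clean part than they would in a single fit. The clean part in this toy case is just the trend plus the periodic pattern, which is what makes it easy to read.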