Time-series representation learning is a fundamental task for time-series analysis. While significant progress has been made toward accurate representations for downstream applications, the learned representations often lack interpretability and do not expose semantic meaning. Unlike previous efforts that operate in an entangled feature space, we aim to extract semantic-rich temporal correlations in a latent, interpretable, factorized representation of the data. Motivated by the success of disentangled representation learning in computer vision, we study the possibility of learning semantic-rich time-series representations, which remains unexplored due to three main challenges: 1) the sequential structure of the data introduces complex temporal correlations and makes the latent representations hard to interpret, 2) sequential models suffer from the KL vanishing problem, and 3) interpretable semantic concepts for time-series often rely on multiple factors rather than individual ones. To bridge the gap, we propose Disentangle Time Series (DTS), a novel disentanglement enhancement framework for sequential data. Specifically, to generate hierarchical semantic concepts as the interpretable and disentangled representation of time-series, DTS introduces multi-level disentanglement strategies that cover both individual latent factors and group semantic segments. We further show theoretically how to alleviate the KL vanishing problem: DTS introduces a mutual information maximization term while preserving a heavier penalty on the total correlation and the dimension-wise KL to keep the disentanglement property. Experimental results on various real-world benchmark datasets demonstrate that the representations learned by DTS achieve superior performance in downstream applications, with high interpretability of semantic concepts.
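The three quantities named in the abstract (mutual information, total correlation, and dimension-wise KL) are the terms of the standard decomposition of the aggregate KL term in a variational autoencoder objective. The abstract does not state the DTS objective itself, so the following is only a sketch under assumed notation: $q(z \mid x)$ is the encoder, $p(z) = \prod_j p(z_j)$ is a factorized prior, $q(z) = \mathbb{E}_{p(x)}[q(z \mid x)]$ is the aggregate posterior, and the weights $\lambda$, $\beta$, $\gamma$ are hypothetical.

```latex
% Decomposition of the aggregate KL term (holds for a factorized prior p(z)).
\mathbb{E}_{p(x)}\Big[\mathrm{KL}\big(q(z \mid x)\,\|\,p(z)\big)\Big]
  = \underbrace{I_q(x; z)}_{\text{mutual information}}
  + \underbrace{\mathrm{KL}\Big(q(z)\,\Big\|\,\prod_j q(z_j)\Big)}_{\text{total correlation}}
  + \underbrace{\sum_j \mathrm{KL}\big(q(z_j)\,\|\,p(z_j)\big)}_{\text{dimension-wise KL}}

% One possible reading of the abstract as a weighted objective (to be maximized).
% \lambda, \beta, \gamma are illustrative weights, not values from the paper.
\mathcal{L}
  = \mathbb{E}_{p(x)}\,\mathbb{E}_{q(z \mid x)}\big[\log p(x \mid z)\big]
  + \lambda\, I_q(x; z)
  - \beta\, \mathrm{KL}\Big(q(z)\,\Big\|\,\prod_j q(z_j)\Big)
  - \gamma \sum_j \mathrm{KL}\big(q(z_j)\,\|\,p(z_j)\big),
  \qquad \lambda \ge 0,\ \beta, \gamma > 1
```

Under this reading, rewarding $I_q(x; z)$ discourages the encoder from ignoring the input (the KL vanishing problem), while the heavier weights on the remaining two terms keep the pressure toward statistically independent, prior-matched latent dimensions.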