Real-world time series data are often generated from several sources of variation. Learning representations that capture the factors contributing to this variability enables a better understanding of the data via its underlying generative process and improves performance on downstream machine learning tasks. This paper proposes a novel generative approach for learning representations for the global and local factors of variation in time series. The local representation of each sample models non-stationarity over time with a stochastic process prior, and the global representation of the sample encodes the time-independent characteristics. To encourage decoupling between the representations, we introduce counterfactual regularization that minimizes the mutual information between the two variables. In experiments, we demonstrate successful recovery of the true local and global variability factors on simulated data, and show that representations learned using our method yield superior performance on downstream tasks on real-world datasets. We believe that the proposed way of defining representations is beneficial for data modelling and yields better insights into the complexity of real-world data.
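The split between a time-independent global factor and a time-varying local factor can be illustrated with a small toy sampler. This is a minimal sketch under stated assumptions, not the paper's model: the function name `sample_series`, the AR(1) process standing in for the stochastic process prior, and the additive decoder are all hypothetical choices made for illustration, and the counterfactual regularization term is not shown.

```python
import random


def sample_series(T=50, rho=0.9, noise_std=0.1, seed=0):
    """Draw one toy series from a global/local latent generative model."""
    rng = random.Random(seed)

    # Global latent: drawn once per sample and shared by every time step.
    z_global = rng.gauss(0.0, 1.0)

    # Local latent: an AR(1) process, assumed here as a simple stand-in
    # for a stochastic process prior over time.
    innov_std = (1.0 - rho ** 2) ** 0.5  # keeps the marginal variance at 1
    z_local = [rng.gauss(0.0, 1.0)]
    for _ in range(T - 1):
        z_local.append(rho * z_local[-1] + innov_std * rng.gauss(0.0, 1.0))

    # Hypothetical additive decoder: the global latent sets the level,
    # the local latent drives the fluctuations, plus observation noise.
    x = [z_global + z_t + noise_std * rng.gauss(0.0, 1.0) for z_t in z_local]
    return z_global, z_local, x


z_global, z_local, x = sample_series(T=100, seed=42)
print(len(x), round(z_global, 3))
```

In this toy setting, two series drawn with different seeds differ in their constant offset (the global factor) and in their trajectory (the local factor), which are the two kinds of variability a representation learner would be asked to separate.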