Time series are often complex and rich in information but sparsely labeled and therefore challenging to model. In this paper, we propose a self-supervised framework for learning generalizable representations for non-stationary time series. Our approach, called Temporal Neighborhood Coding (TNC), takes advantage of the local smoothness of a signal's generative process to define neighborhoods in time with stationary properties. Using a debiased contrastive objective, our framework learns time series representations by ensuring that, in the encoding space, the distribution of signals from within a neighborhood is distinguishable from the distribution of non-neighboring signals. Our motivation stems from the medical field, where the ability to model the dynamic nature of time series data is especially valuable for identifying, tracking, and predicting patients' underlying latent states in settings where labeling data is practically impossible. We compare our method to recently developed unsupervised representation learning approaches and demonstrate superior performance on clustering and classification tasks for multiple datasets.
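To make the debiased contrastive objective concrete, the following is a minimal sketch of a TNC-style loss in PyTorch. It assumes a generic encoder that maps windows to fixed-size embeddings and a small discriminator that scores whether two embeddings come from the same temporal neighborhood; the names (`Discriminator`, `tnc_loss`, the weight `w`) are illustrative assumptions, not the authors' reference implementation.

```python
# Hedged sketch of a TNC-style debiased contrastive loss (assumed API, not the
# official implementation). Neighboring windows act as positives; distant
# windows are down-weighted as negatives because a "non-neighboring" window
# may still come from the same underlying latent state.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Discriminator(nn.Module):
    """Scores whether two encodings belong to the same temporal neighborhood."""

    def __init__(self, encoding_size):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * encoding_size, encoding_size),
            nn.ReLU(),
            nn.Linear(encoding_size, 1),
        )

    def forward(self, z_ref, z_other):
        # Concatenate the reference and candidate encodings and return a logit.
        return self.net(torch.cat([z_ref, z_other], dim=-1)).squeeze(-1)


def tnc_loss(encoder, disc, x_ref, x_neighbor, x_distant, w=0.05):
    """Debiased contrastive loss: neighbors are positives; distant windows are
    treated as negatives with weight (1 - w) and as positives with weight w,
    which corrects for the sampling bias of drawing false negatives."""
    z_ref = encoder(x_ref)        # (B, D) reference windows
    z_pos = encoder(x_neighbor)   # (B, D) windows sampled from the neighborhood
    z_neg = encoder(x_distant)    # (B, D) windows sampled far away in time

    pos_logits = disc(z_ref, z_pos)
    neg_logits = disc(z_ref, z_neg)

    ones = torch.ones_like(pos_logits)
    zeros = torch.zeros_like(neg_logits)

    loss_pos = F.binary_cross_entropy_with_logits(pos_logits, ones)
    loss_neg = (1 - w) * F.binary_cross_entropy_with_logits(neg_logits, zeros) \
        + w * F.binary_cross_entropy_with_logits(neg_logits, ones)
    return loss_pos + loss_neg
```

In this sketch, setting `w = 0` recovers a standard positive/negative contrastive objective, while a small positive `w` hedges against distant windows that happen to share the reference window's state.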