Irregularly-sampled time series (ITS) are native to high-impact domains like healthcare, where measurements are collected over time at uneven intervals. For many classification problems, however, only small portions of a long time series are relevant to the class label. In this setting, existing ITS models often fail to classify long series because they rely on careful imputation, which easily over- or under-samples the relevant regions. Building on this insight, we propose CAT, a model that classifies multivariate ITS by explicitly seeking highly-relevant portions of an input series' timeline. CAT achieves this by integrating three components: (1) A Moment Network learns to seek relevant moments in an ITS's continuous timeline using reinforcement learning. (2) A Receptor Network models the temporal dynamics of both the observations and their timing, localized around the predicted moments. (3) A recurrent Transition Model models the sequence of transitions between these moments, building the representation with which the series is classified. Using synthetic and real data, we find that CAT outperforms ten state-of-the-art methods by finding short signals in long irregular time series.
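To make the idea of looking at observations "localized around a moment" concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not CAT's implementation: the helper `local_window`, the fixed `width` parameter, and the toy series are all hypothetical, and the moment is chosen by hand rather than predicted by a learned network. The sketch only shows how both the values and their timing (as offsets from the moment) can be read directly from an irregular series without imputing onto a regular grid.

```python
from bisect import bisect_left, bisect_right


def local_window(times, values, moment, width):
    """Return (time_offset, value) pairs for observations whose timestamps
    fall within `width` of `moment`. `times` must be sorted ascending.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    lo = bisect_left(times, moment - width)
    hi = bisect_right(times, moment + width)
    # Keep the offset from the moment so the timing of each observation
    # is preserved alongside its value.
    return [(t - moment, v) for t, v in zip(times[lo:hi], values[lo:hi])]


# Toy irregular series (assumed data): sparse at the ends, dense near t = 0.5.
times = [0.02, 0.10, 0.45, 0.48, 0.50, 0.53, 0.90]
values = [1.0, 1.2, 3.1, 3.4, 3.3, 3.0, 1.1]

# A hand-picked moment; in CAT this would come from the Moment Network.
window = local_window(times, values, moment=0.5, width=0.06)
for offset, value in window:
    print(f"offset={offset:+.2f} value={value}")
```

With this toy data, the window around `moment=0.5` picks up only the dense burst of four observations and ignores the sparse readings elsewhere, while a window centered on an empty stretch of the timeline returns nothing.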