Generating multivariate time series is a promising approach for sharing sensitive data in many medical, financial, and IoT applications. A common type of multivariate time series originates from a single source, such as the biometric measurements of a medical patient. This common origin leads to complex dynamical patterns between the individual time series that are hard for typical generative models such as GANs to learn. These patterns carry valuable information that machine learning models can use to better classify, predict, or perform other downstream tasks. We propose a novel framework that takes the common origin of the time series into account and favors the preservation of relationships between channels/features. The two key points of our method are: 1) the individual time series are generated from a common point in latent space and 2) a central discriminator favors the preservation of inter-channel/feature dynamics. We demonstrate empirically that our method helps preserve channel/feature correlations and that our synthetic data performs very well in downstream tasks with medical and financial data.
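The two key points can be illustrated with a minimal sketch. It is not the implementation used in this work: the dimensions, the untrained linear per-channel generators, and the names `generate` and `central_score` are all assumptions chosen for illustration. It only shows the data flow, in which every channel is derived from the same latent point and a single central discriminator sees all channels jointly.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 8    # assumed size of the shared latent vector
N_CHANNELS = 3    # assumed number of channels/features
SEQ_LEN = 16      # assumed series length

# One hypothetical linear "generator" per channel (untrained random weights).
channel_weights = [rng.normal(size=(LATENT_DIM, SEQ_LEN)) for _ in range(N_CHANNELS)]

# Hypothetical central discriminator: one linear layer over all channels jointly.
central_weights = rng.normal(size=(N_CHANNELS * SEQ_LEN,))


def generate(z):
    """Map one shared latent point z to a (N_CHANNELS, SEQ_LEN) sample."""
    return np.stack([np.tanh(z @ w) for w in channel_weights])


def central_score(sample):
    """Score the full multichannel sample, so cross-channel structure is visible."""
    logit = sample.reshape(-1) @ central_weights
    return 1.0 / (1.0 + np.exp(-logit))


z = rng.normal(size=LATENT_DIM)   # a single common point in latent space
sample = generate(z)              # every channel is derived from the same z
score = central_score(sample)     # probability-like score in [0, 1]
```

Because the central discriminator receives the stacked channels rather than each channel in isolation, its training signal can in principle penalize samples whose inter-channel dynamics look unrealistic.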