Real-valued time series are ubiquitous in the sciences and engineering. In this work, a general, hierarchical Bayesian modelling framework is developed for building mixture models for time series. This development is based, in part, on the use of context trees, and it includes a collection of effective algorithmic tools for learning and inference. A discrete context (or 'state') is extracted for each sample, consisting of a discretised version of some of the most recent observations preceding it. The set of all relevant contexts is represented as a discrete context tree. At the bottom level, a different real-valued time series model is associated with each context-state, i.e., with each leaf of the tree. This defines a very general framework that can be used in conjunction with any existing model class to build flexible and interpretable mixture models. Extending the idea of context-tree weighting leads to algorithms that allow for efficient, exact Bayesian inference in this setting. The utility of the general framework is illustrated in detail when autoregressive (AR) models are used at the bottom level, resulting in a nonlinear AR mixture model. The associated methods are found to outperform several state-of-the-art techniques on simulated and real-world experiments.
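To make the construction in the abstract concrete, the following is a minimal sketch of the two ingredients it names: extracting a discrete context from the most recent observations, and attaching a separate AR model to each context. It is not the Bayesian context-tree weighting procedure of the paper. It assumes a fixed, full tree of a given depth, a one-threshold binary quantiser, and plain least-squares fitting in place of exact Bayesian inference. All names (`extract_context`, `fit_context_ar`, `predict_next`) and parameters (`depth`, `order`, `threshold`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def extract_context(x, t, depth, threshold=0.0):
    """Discretise the `depth` samples preceding x[t] into a binary context.

    The most recent sample comes first. The one-threshold quantiser is an
    assumption of this sketch, not something fixed by the abstract.
    """
    return tuple(int(x[t - j] > threshold) for j in range(1, depth + 1))


def fit_context_ar(x, depth=1, order=1, threshold=0.0):
    """Fit one least-squares AR(order) model per context.

    Each context plays the role of a leaf of a full tree of the given depth.
    Returns a dict mapping context -> [intercept, a_1, ..., a_order].
    """
    start = max(depth, order)
    rows, targets = {}, {}
    for t in range(start, len(x)):
        s = extract_context(x, t, depth, threshold)
        # Regressors: an intercept followed by the `order` most recent samples.
        rows.setdefault(s, []).append(
            [1.0] + [x[t - j] for j in range(1, order + 1)]
        )
        targets.setdefault(s, []).append(x[t])
    models = {}
    for s in rows:
        A = np.asarray(rows[s], dtype=float)
        b = np.asarray(targets[s], dtype=float)
        coef, *_ = np.linalg.lstsq(A, b, rcond=None)
        models[s] = coef
    return models


def predict_next(x, models, depth=1, order=1, threshold=0.0):
    """One-step-ahead prediction using the AR model of the current context."""
    t = len(x)
    s = extract_context(x, t, depth, threshold)
    coef = models[s]
    return float(coef[0] + sum(coef[j] * x[t - j] for j in range(1, order + 1)))
```

Calling `fit_context_ar(x, depth=1, order=1)` on a one-dimensional array `x` returns one coefficient vector per observed context, and `predict_next(x, models)` selects the vector that matches the latest observations. Because the active AR model switches with the context, the overall predictor is nonlinear even though each leaf model is linear. A context that never occurs in the training data has no fitted model in this sketch, so `predict_next` would raise a `KeyError` for it.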