Finite-order Markov models are theoretically well-studied models for dependent discrete data. Despite their generality, they are rarely applied in empirical work when the order is large. Practitioners avoid higher-order Markov models because (1) the number of parameters grows exponentially with the order and (2) the interpretation is often difficult. Mixture of transition distribution (MTD) models were introduced to overcome both limitations. MTD models represent higher-order Markov models as a convex mixture of single-step Markov chains, reducing the number of parameters and increasing interpretability. Nevertheless, in practice, estimation of MTD models with large orders is still limited because of the curse of dimensionality and high algorithmic complexity. Here, we prove that if only a few lags are relevant, we can consistently and efficiently recover the lags and estimate the transition probabilities of high-dimensional MTD models. The key innovation is a recursive procedure for selecting the relevant lags of the model. Our results are based on (1) a new structural result for MTD models and (2) an improved martingale concentration inequality. We illustrate our method using simulations and a weather data set.
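To make the convex-mixture representation concrete, the following is a minimal sketch of how an MTD transition probability could be computed and how its parameter count compares with a full higher-order chain. It assumes the common MTD form without an independent component, P(X_t = a | past) = sum_j lambda_j p_j(a | x_{t-j}), with one transition matrix per lag. The function names `mtd_transition`, `n_free_params_full`, and `n_free_params_mtd` are hypothetical and are not taken from the paper; the lag-selection procedure itself is not shown.

```python
import numpy as np


def mtd_transition(past, weights, kernels):
    """Return the distribution of X_t given the last d symbols under an MTD model.

    Assumed (illustrative) conventions:
      past[j-1]    -- symbol observed j steps back (lag j), coded 0..|A|-1
      weights[j-1] -- mixture weight lambda_j (non-negative, summing to 1)
      kernels[j-1] -- row-stochastic |A| x |A| matrix p_j for lag j
    """
    past = np.asarray(past)
    weights = np.asarray(weights, dtype=float)
    kernels = np.asarray(kernels, dtype=float)
    # For each lag j, pick the row of p_j indexed by the symbol seen at that lag.
    rows = kernels[np.arange(len(past)), past]  # shape (d, |A|)
    # Convex combination of the selected single-step transition rows.
    return weights @ rows


def n_free_params_full(order, n_states):
    # One free distribution over n_states symbols for every possible past.
    return n_states ** order * (n_states - 1)


def n_free_params_mtd(order, n_states):
    # One transition matrix per lag, plus (order - 1) free mixture weights.
    return order * n_states * (n_states - 1) + (order - 1)
```

Under these assumptions, a binary chain of order 10 needs 1024 free parameters in the full model but only 29 in the MTD form. A lag with weight zero does not affect the transition distribution, which is the sense in which only a few lags may be relevant.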