Even simply defined, finite-state generators produce stochastic processes that require tracking an uncountable infinity of probabilistic features for optimal prediction. For processes generated by hidden Markov chains, the consequences are dramatic: their predictive models are generically infinite-state and, until recently, one could determine neither their intrinsic randomness nor their structural complexity. The prequel, though, introduced methods to accurately calculate their Shannon entropy rate (randomness) and to constructively determine their minimal (though infinite) set of predictive features. Leveraging these methods, we address the complementary challenge of determining how structured hidden Markov processes are by calculating their statistical complexity dimension, the information dimension of the minimal set of predictive features. This dimension tracks the divergence rate of the minimal memory resources required to optimally predict a broad class of truly complex processes.