Inspired by Hilberg's hypothesis, which states that mutual information between blocks for natural language grows like a power law, we seek for links between power-law growth rate of algorithmic mutual information and of some estimator of the unifilar order, i.e., the number of hidden states in the generating stationary ergodic source in its minimal unifilar hidden Markov representation. We consider an order estimator which returns the smallest order for which the maximum likelihood is larger than a weakly penalized universal probability. This order estimator is intractable and follows the ideas by Merhav, Gutman, and Ziv (1989) and by Ziv and Merhav (1992) but in its exact form seems overlooked despite attractive theoretical properties. In particular, we can prove both strong consistency of this order estimator and an upper bound of algorithmic mutual information in terms of it. Using both results, we show that all (also uncomputable) sources of a finite unifilar order exhibit sub-power-law growth of algorithmic mutual information and of the unifilar order estimator. In contrast, we also exhibit an example of unifilar processes of a countably infinite order and an algorithmically random oracle, for which the mentioned two quantities grow as a power law with the same exponent. We also relate our results to natural language research.
翻译:在希尔伯格的假设启发下,自然语言区块之间的相互信息像权力法一样增长,我们寻求在算法相互信息的权力法增长率和某些非虚拟秩序的估算者之间建立联系,即,生成定序源中隐藏的定序源数,以其最小的隐隐隐的Markov为代表。我们考虑一个定序估计器,该定序返回最小的顺序,其最大可能性大于微弱的受处罚的普遍概率。这个定序测量器非常棘手,遵循Merhav、Gutman和Ziv(1989年)以及Ziv和Merhav(1992年)的理念,但尽管理论性质具有吸引力,却似乎完全被忽视了。特别是,我们既可以证明该定序标码源的高度一致,又可以证明它具有最上层的逻辑相互信息。我们用两个结果来证明,一个(同样无法令人接受的)定序的所有(不可辩驳的)来源显示了算法共同信息以及不可估量的顺序的次权法增长。在对比中,我们用一个无限的自然序列的模型展示了我们所引用的自然秩序,一个无限的模型。