Developing new ways to estimate probabilities can be valuable for science, statistics, and engineering. By considering the information content of different output patterns, recent work invoking algorithmic information theory has shown that a priori probability predictions based on pattern complexities can be made in a broad class of input-output maps. These algorithmic probability predictions do not depend on detailed knowledge of how output patterns were produced, or on historical statistical data. Although quantitatively fairly accurate, a main weakness of these predictions is that they are given only as an upper bound on the probability of a pattern, yet many low-complexity, low-probability patterns occur, and for these the upper bound has little predictive value. Here we study this low-complexity, low-probability phenomenon by looking at example maps, namely a finite state transducer, natural time series data, RNA molecule structures, and polynomial curves. Some mechanisms causing low-complexity, low-probability behaviour are identified, and we argue that this behaviour should be assumed as a default in real-world algorithmic probability studies. Additionally, we examine some applications of algorithmic probability and discuss some implications of low-complexity, low-probability patterns for several research areas, including simplicity in physics and biology, a priori probability predictions, Solomonoff induction and Occam's razor, machine learning, and password guessing.
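As a rough illustration of the complexity-based upper bound described in the abstract, the sketch below enumerates every input of a small toy map, records how often each output pattern appears, and estimates each pattern's complexity from its compressed size. Everything here is an assumption made for illustration and is not taken from the paper: the bound form `2 ** (-a * k - b)`, the constants `a` and `b` (which would have to be fitted per map), the zlib-based complexity proxy, and the hypothetical `toy_map`. Short strings compress poorly with zlib, so the proxy is only useful for ranking patterns, not as an absolute value.

```python
import itertools
import zlib
from collections import Counter


def complexity_estimate(pattern: str) -> int:
    """Crude complexity proxy (assumption): zlib-compressed size in bits."""
    return 8 * len(zlib.compress(pattern.encode("ascii"), 9))


def upper_bound(k: float, a: float = 1.0, b: float = 0.0) -> float:
    """Assumed bound form 2^(-a*k - b); a and b are map-dependent fit constants."""
    return 2.0 ** (-a * k - b)


def toy_map(bits) -> str:
    """Hypothetical map: emit '1' while the running count of 1s exceeds 0s."""
    total = 0
    out = []
    for bit in bits:
        total += 1 if bit else -1
        out.append("1" if total > 0 else "0")
    return "".join(out)


def output_probabilities(n: int) -> dict:
    """Probability of each output pattern under uniformly random n-bit inputs."""
    counts = Counter(
        toy_map(bits) for bits in itertools.product((0, 1), repeat=n)
    )
    total = 2 ** n
    return {pattern: count / total for pattern, count in counts.items()}


if __name__ == "__main__":
    probs = output_probabilities(10)
    # List the most probable patterns next to their complexity estimates.
    ranked = sorted(probs.items(), key=lambda item: -item[1])
    for pattern, p in ranked[:5]:
        print(pattern, p, complexity_estimate(pattern))
```

Plotting probability against the complexity estimate for all patterns, with `upper_bound` overlaid after fitting `a` and `b`, is one way to look for patterns that fall far below the bound, which is the low-complexity, low-probability behaviour the abstract discusses.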