In this paper, we propose a novel equivalence between probability theory and information theory. For a single random variable, Shannon's self-information, $I = -\log p$, is an alternative expression of a probability $p$. For two random variables, however, no information measure equivalent to the $p$-value has been identified. Here, we prove theorems demonstrating that mutual information (MI) is equivalent to the $p$-value, irrespective of prior information about the distribution of the variables. When the maximum entropy principle can be applied, our equivalence theorems allow the $p$-value to be computed readily from multidimensional MI. By contrast, for a contingency table of any size with known marginal frequencies, our theorem states that MI asymptotically coincides with the logarithm of the $p$-value of Fisher's exact test divided by the sample size. The theorems therefore enable a meta-analysis that accurately estimates MI with a low $p$-value, yielding a measure of informational interdependence that is robust against variation in sample size. Thus, our theorems demonstrate the equivalence of the $p$-value and MI in every dimension, combine the merits of both, and provide a fundamental basis for integrating probability theory and information theory.
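The contingency-table correspondence stated above can be checked numerically. The following sketch (not part of the paper) computes the empirical MI of a hypothetical 2x2 table and compares it with $-\log p / n$, where $p$ is the $p$-value of Fisher's exact test obtained from `scipy.stats.fisher_exact`; the table entries, the sign convention, and the use of natural logarithms are assumptions made here for illustration only.

```python
# A minimal numerical sketch (not part of the paper) of the claimed asymptotic
# relation between MI and Fisher's exact test for a 2x2 contingency table.
# The table entries below are hypothetical; MI is computed in nats, and the
# p-value is converted as -log(p)/n so that both quantities are positive.
import numpy as np
from scipy.stats import fisher_exact

table = np.array([[30, 10],
                  [12, 28]])
n = table.sum()

# Empirical mutual information from the joint frequency table.
p_xy = table / n
p_x = p_xy.sum(axis=1, keepdims=True)   # row marginals
p_y = p_xy.sum(axis=0, keepdims=True)   # column marginals
indep = p_x @ p_y                       # product of marginals
mask = p_xy > 0
mi = np.sum(p_xy[mask] * np.log(p_xy[mask] / indep[mask]))

# p-value of Fisher's exact test for the same table.
_, p_value = fisher_exact(table)

print(f"MI (nats)  : {mi:.4f}")
print(f"-log(p)/n  : {-np.log(p_value) / n:.4f}")
```

Scaling the table up while keeping the same cell proportions (e.g., multiplying every cell by 10) should bring the two printed values closer together, which is the sense in which the coincidence is asymptotic.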