In the rate-distortion function and the Maximum Entropy (ME) method, Minimum Mutual Information (MMI) distributions and ME distributions are expressed by Bayes-like formulas, including Negative Exponential Functions (NEFs) and partition functions. Why do these non-probability functions appear in Bayes-like formulas? On the other hand, the rate-distortion function has three disadvantages: (1) the distortion function is subjectively defined; (2) the distortion function between instances and labels is often difficult to define; (3) it cannot be used for data compression according to the labels' semantic meanings. The author previously proposed the semantic information G measure, which uses both statistical probability and logical probability. We can now explain NEFs as truth functions, partition functions as logical probabilities, Bayes-like formulas as semantic Bayes' formulas, MMI as Semantic Mutual Information (SMI), and ME as extreme ME minus SMI. To overcome the above disadvantages, this paper establishes the relationship between truth functions and distortion functions, obtains truth functions from samples by machine learning, and constructs constraint conditions with truth functions to extend rate-distortion functions. Two examples are provided to help readers understand the MMI iteration and to support the theoretical results. Using truth functions and the semantic information G measure, we can combine machine learning and data compression, including semantic compression. Further studies are needed to explore general data compression and recovery according to semantic meanings.
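To make the "Bayes-like formulas" concrete, here is a hedged sketch in standard rate-distortion notation; the symbols $d_{ij} = d(x_i, y_j)$, $s$, $T(\theta_j \mid x)$, and $T(\theta_j)$ follow the Blahut-Arimoto and G-theory literature and are assumptions, not taken from this abstract itself. The MMI distribution that achieves $R(D)$ has the form

\[
P(y_j \mid x_i) = \frac{P(y_j)\, e^{-s d_{ij}}}{\lambda_i},
\qquad
\lambda_i = \sum_k P(y_k)\, e^{-s d_{ik}}, \quad s \ge 0,
\]

while the semantic Bayes' formula produces a posterior from a prior and a truth function:

\[
P(x \mid \theta_j) = \frac{T(\theta_j \mid x)\, P(x)}{T(\theta_j)},
\qquad
T(\theta_j) = \sum_i P(x_i)\, T(\theta_j \mid x_i).
\]

Since $d_{ij} \ge 0$ gives $e^{-s d_{ij}} \in (0, 1]$, the NEF behaves like a truth function and the partition function $\lambda_i$ like a logical probability, which is the reading proposed above; the bridge $d_{ij} = -\log T(\theta_j \mid x_i)$ then links distortion functions to truth functions.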