The exponential family is well known in machine learning and statistical physics as the maximum entropy distribution subject to a set of observed constraints, while the geometric mixture path is common in MCMC methods such as annealed importance sampling. Linking these two ideas, recent work has interpreted the geometric mixture path as an exponential family of distributions in order to analyze the thermodynamic variational objective (TVO). We extend these likelihood ratio exponential families to include solutions to rate-distortion (RD) optimization, the information bottleneck (IB) method, and recent rate-distortion-classification approaches which combine RD and IB. This provides a common mathematical framework for understanding these methods via the conjugate duality of exponential families and hypothesis testing. Further, we collect existing results to provide a variational representation of intermediate RD or TVO distributions as minimizing an expectation of KL divergences. This solution also corresponds to a size-power tradeoff using the likelihood ratio test and the Neyman-Pearson lemma. In thermodynamic integration bounds such as the TVO, we identify the intermediate distribution whose expected sufficient statistics match the log partition function.
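To make the likelihood ratio exponential family concrete, the following is a minimal sketch on a three-state toy problem, not the implementation from this work. It assumes a normalized base distribution `pi0` and an unnormalized target `pi1_tilde`; these values and all function names (`geometric_path`, `expected_stat`, `mixture_objective`, `find_beta_star`) are hypothetical and chosen only for illustration. The sketch builds the geometric path $\pi_\beta \propto \pi_0^{1-\beta}\tilde{\pi}_1^{\beta}$ with sufficient statistic $\log \tilde{\pi}_1/\pi_0$ and log partition function $\psi(\beta)$, evaluates the weighted KL objective $(1-\beta)\,\mathrm{KL}(r\|\pi_0) + \beta\,\mathrm{KL}(r\|\pi_1)$ that $\pi_\beta$ minimizes, and uses bisection to locate a $\beta^*$ at which the expected sufficient statistic equals $\psi(1) - \psi(0)$.

```python
import math


def geometric_path(pi0, pi1_tilde, beta):
    """Return (pi_beta, psi) with pi_beta(z) proportional to pi0(z)^(1-beta) * pi1_tilde(z)^beta."""
    log_w = [(1.0 - beta) * math.log(a) + beta * math.log(b)
             for a, b in zip(pi0, pi1_tilde)]
    m = max(log_w)
    # psi(beta): log partition function, computed with a stable log-sum-exp.
    psi = m + math.log(sum(math.exp(lw - m) for lw in log_w))
    pi_beta = [math.exp(lw - psi) for lw in log_w]
    return pi_beta, psi


def expected_stat(pi0, pi1_tilde, beta):
    """eta(beta) = E_{pi_beta}[log pi1_tilde / pi0], the derivative of psi at beta."""
    pi_beta, _ = geometric_path(pi0, pi1_tilde, beta)
    return sum(p * (math.log(b) - math.log(a))
               for p, a, b in zip(pi_beta, pi0, pi1_tilde))


def kl(r, p):
    """KL(r || p) for discrete distributions given as lists."""
    return sum(ri * math.log(ri / pi) for ri, pi in zip(r, p) if ri > 0)


def mixture_objective(r, pi0, pi1, beta):
    """(1 - beta) * KL(r || pi0) + beta * KL(r || pi1), with pi1 normalized."""
    return (1.0 - beta) * kl(r, pi0) + beta * kl(r, pi1)


def find_beta_star(pi0, pi1_tilde, tol=1e-12):
    """Bisection for beta* in [0, 1] with eta(beta*) = psi(1) - psi(0).

    Relies on eta being nondecreasing in beta, which holds because psi is convex.
    """
    target = (geometric_path(pi0, pi1_tilde, 1.0)[1]
              - geometric_path(pi0, pi1_tilde, 0.0)[1])
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_stat(pi0, pi1_tilde, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative toy values (assumptions, not taken from the paper).
pi0 = [0.5, 0.3, 0.2]            # normalized base distribution
pi1_tilde = [0.02, 0.10, 0.28]   # unnormalized target; its sum plays the role of Z
Z = sum(pi1_tilde)
pi1 = [b / Z for b in pi1_tilde]

beta_star = find_beta_star(pi0, pi1_tilde)
print("beta* =", beta_star)
print("eta(beta*) =", expected_stat(pi0, pi1_tilde, beta_star), "log Z =", math.log(Z))
```

In this toy setting, $\psi(0) = 0$ and $\psi(1) = \log Z$, so $\beta^*$ is the point on the path where the expected log likelihood ratio equals the log partition function, lying between the lower bound $\eta(0)$ and the upper bound $\eta(1)$. Evaluating `mixture_objective` at `geometric_path(pi0, pi1_tilde, beta)[0]` and at any other distribution on the same three states gives a quick numerical check of the variational representation.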