We consider extensions of the Shannon relative entropy, referred to as f-divergences. Three classical computational problems are typically associated with these divergences: (a) estimation from moments, (b) computing normalizing integrals, and (c) variational inference in probabilistic models. These problems are related to one another through convex duality, and for all of them there are many applications throughout data science; we aim for computationally tractable approximation algorithms that preserve properties of the original problem, such as potential convexity or monotonicity. To achieve this, we derive a sequence of convex relaxations for computing these divergences from non-centered covariance matrices associated with a given feature vector: starting from the typically intractable optimal lower bound, we consider an additional relaxation based on "sums of squares", which is computable in polynomial time as a semidefinite program, as well as further, computationally more efficient relaxations based on spectral information divergences from quantum information theory. For all of the tasks above, beyond proposing new relaxations, we derive tractable algorithms based on augmented Lagrangians and first-order methods, and we present illustrations on multivariate trigonometric polynomials and functions on the Boolean hypercube.
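To make the spectral relaxation concrete, here is a minimal numerical sketch (not the paper's code): it estimates non-centered covariance (moment) matrices E_p[phi(x) phi(x)^T] for a trigonometric feature vector by Monte Carlo, and evaluates the quantum (von Neumann) relative entropy tr(A(log A - log B)) between them, which in the Kullback-Leibler case yields a spectral lower bound of the kind described above. The helper names phi, moment_matrix, and spectral_kl_bound, as well as the choice of distributions, are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import logm

def phi(x):
    # Normalized trigonometric feature vector on [-pi, pi]: ||phi(x)||^2 = 1
    # for every x, so each moment matrix below has unit trace, like a
    # density matrix in quantum information theory.
    return np.array([1.0,
                     np.sqrt(2) * np.cos(x), np.sqrt(2) * np.sin(x),
                     np.sqrt(2) * np.cos(2 * x), np.sqrt(2) * np.sin(2 * x)]) / np.sqrt(5)

def moment_matrix(samples):
    # Monte Carlo estimate of the non-centered covariance E[phi(x) phi(x)^T].
    Phi = np.stack([phi(x) for x in samples])
    return Phi.T @ Phi / len(samples)

def spectral_kl_bound(Sigma_p, Sigma_q, eps=1e-8):
    # Quantum relative entropy tr(A (log A - log B)) between the two moment
    # matrices; by Klein's inequality it is nonnegative for unit-trace PSD
    # matrices, and it lower-bounds KL(p || q) in this setting.
    d = Sigma_p.shape[0]
    A = Sigma_p + eps * np.eye(d)   # regularize to keep logm well defined
    B = Sigma_q + eps * np.eye(d)
    return float(np.trace(A @ (logm(A) - logm(B))).real)

rng = np.random.default_rng(0)
x_p = rng.vonmises(mu=0.0, kappa=2.0, size=20000)   # samples from p
x_q = rng.uniform(-np.pi, np.pi, size=20000)        # samples from q
print(spectral_kl_bound(moment_matrix(x_p), moment_matrix(x_q)))
```

The bound only sees the distributions through their moment matrices, so enlarging the feature vector (higher trigonometric frequencies here) tightens it toward the true divergence, at the cost of larger matrix logarithms.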