We consider Gibbs distributions, which are families of probability distributions over a discrete space $\Omega$ with probability mass function of the form $\mu^\Omega_\beta(\omega) \propto e^{\beta H(\omega)}$ for $\beta$ in an interval $[\beta_{\min}, \beta_{\max}]$ and $H(\omega) \in \{0\} \cup [1, n]$. The \emph{partition function} is the normalization factor $Z(\beta)=\sum_{\omega\in\Omega}e^{\beta H(\omega)}$. Two important parameters of these distributions are the partition ratio $q = \log \tfrac{Z(\beta_{\max})}{Z(\beta_{\min})}$ and the counts $c_x = |H^{-1}(x)|$. These are correlated with system parameters in a number of physical applications and sampling algorithms. Our first main result is an algorithm that estimates the counts $c_x$ using roughly $\tilde O( \frac{q}{\varepsilon^2})$ samples for general Gibbs distributions and $\tilde O( \frac{n^2}{\varepsilon^2} )$ samples for integer-valued distributions (ignoring some second-order terms and parameters), and we show that this is optimal up to logarithmic factors. We illustrate this with improved algorithms for counting connected subgraphs and perfect matchings in a graph. A key subroutine we develop estimates the partition function $Z$; specifically, we construct a data structure capable of estimating $Z(\beta)$ for all values of $\beta$ without further samples. Constructing the data structure requires $\tilde O(\frac{q}{\varepsilon^2})$ samples for general Gibbs distributions and $\tilde O(\frac{n^2}{\varepsilon^2})$ samples for integer-valued distributions. This improves over a prior algorithm of Kolmogorov (2018), which computes the single point estimate $Z(\beta_{\max})$ using $\tilde O(\frac{q}{\varepsilon^2})$ samples. We show matching lower bounds, demonstrating that this complexity is optimal as a function of $n$ and $q$ up to logarithmic terms.
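To make the definitions concrete, the following minimal sketch computes $Z(\beta)$, the partition ratio $q$, and the counts $c_x$ by brute-force enumeration of a tiny state space. It is not the sampling algorithm of the paper; the toy values of $H$, the interval $[0, 1]$, and all function names are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def partition_function(h_values, beta):
    """Z(beta) = sum over omega of exp(beta * H(omega)), by direct enumeration."""
    return sum(math.exp(beta * h) for h in h_values)


def counts(h_values):
    """c_x = |H^{-1}(x)|: the number of states omega with H(omega) = x."""
    return Counter(h_values)


def log_partition_ratio(h_values, beta_min, beta_max):
    """q = log(Z(beta_max) / Z(beta_min))."""
    z_max = partition_function(h_values, beta_max)
    z_min = partition_function(h_values, beta_min)
    return math.log(z_max / z_min)


# Hypothetical toy instance: four states with H in {0} U [1, n], n = 2.
h_values = [0, 1, 1, 2]
beta_min, beta_max = 0.0, 1.0

c = counts(h_values)
q = log_partition_ratio(h_values, beta_min, beta_max)

# The counts determine Z, since Z(beta) = sum over x of c_x * exp(beta * x).
z_from_counts = sum(c_x * math.exp(beta_max * x) for x, c_x in c.items())

print(dict(c), q, z_from_counts)
```

For this toy instance $Z(\beta) = (1 + e^{\beta})^2$, so $q = 2\log\frac{1+e}{2}$. The identity $Z(\beta) = \sum_x c_x e^{\beta x}$ used in the last step shows why the counts and the partition function are closely linked.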