Estimating free energy differences, an important problem in computational drug discovery and in a wide range of other application areas, commonly involves a computationally intensive process of sampling a family of high-dimensional probability distributions and a procedure for computing estimates based on those samples. The variance of the free energy estimate of interest typically depends strongly on how the total computational resources available for sampling are divided among the distributions, but determining an efficient allocation is difficult without sampling the distributions. Here we introduce the Times Square sampling algorithm, a novel on-the-fly estimation method that dynamically allocates resources in such a way as to significantly accelerate the estimation of free energies and other observables, while providing rigorous convergence guarantees for the estimators. We also show that it is possible, surprisingly, for on-the-fly free energy estimation to achieve lower asymptotic variance than the maximum-likelihood estimator MBAR, raising the prospect that on-the-fly estimation could reduce variance in a variety of other statistical applications.
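The dependence of variance on resource allocation can be illustrated with a toy model. The sketch below is not the Times Square sampling algorithm; it assumes, purely for illustration, that the quantity of interest is a sum of independent sample means, one per distribution, so its variance is the sum of sigma_i^2 / n_i. Under that assumption, a fixed budget is best spent with n_i proportional to sigma_i. The names `sigmas`, `budget`, and the helper functions are hypothetical, and sample counts are treated as real numbers.

```python
def variance_of_sum(sigmas, counts):
    """Variance of a sum of independent sample means: sum of sigma_i^2 / n_i."""
    return sum(s ** 2 / n for s, n in zip(sigmas, counts))


def uniform_allocation(sigmas, budget):
    """Split the sampling budget equally among the distributions."""
    return [budget / len(sigmas)] * len(sigmas)


def proportional_allocation(sigmas, budget):
    """Split the budget in proportion to each distribution's standard deviation.

    For the toy variance model above, this allocation minimizes the total
    variance for a fixed budget.
    """
    total = sum(sigmas)
    return [budget * s / total for s in sigmas]


# Hypothetical per-sample standard deviations: one distribution is much noisier.
sigmas = [1.0, 1.0, 10.0]
budget = 1200

uniform_var = variance_of_sum(sigmas, uniform_allocation(sigmas, budget))
optimal_var = variance_of_sum(sigmas, proportional_allocation(sigmas, budget))

print(f"uniform allocation variance:      {uniform_var:.3f}")
print(f"proportional allocation variance: {optimal_var:.3f}")
```

In this toy setting the proportional split roughly halves the variance relative to the uniform split. The catch is that the sigma_i values are not known before sampling, which is the difficulty that motivates allocating resources on the fly.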