Undirected probabilistic graphical models represent the conditional dependencies, or Markov properties, of a collection of random variables. Knowing the sparsity of such a graphical model is valuable for modeling multivariate distributions and for efficiently performing inference. While the problem of learning graph structure from data has been studied extensively for certain parametric families of distributions, most existing methods fail to consistently recover the graph structure for non-Gaussian data. Here we propose an algorithm for learning the Markov structure of continuous and non-Gaussian distributions. To characterize conditional independence, we introduce a score based on integrated Hessian information from the joint log-density, and we prove that this score upper bounds the conditional mutual information for a general class of distributions. To compute the score, our algorithm, SING, estimates the density using a deterministic coupling, induced by a triangular transport map, and iteratively exploits sparse structure in the map to reveal sparsity in the graph. For certain non-Gaussian datasets, we show that our algorithm recovers the graph structure even with a biased approximation to the density. Among other examples, we apply SING to learn the dependencies between the states of a chaotic dynamical system with local interactions.
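To make the idea of a Hessian-based score concrete, the following is a minimal sketch, not the SING algorithm itself. It assumes the score for a pair (i, j) takes the form of an expectation of the squared mixed second derivative of the joint log-density, which is one reading of "integrated Hessian information"; the exact definition is not given in the abstract. It also assumes the log-density is known in closed form, whereas SING estimates the density through a triangular transport map. The function names `hessian_fd` and `hessian_score`, the toy Gaussian chain, and the threshold `1e-3` are all hypothetical choices made for illustration.

```python
import numpy as np

def hessian_fd(log_density, x, eps=1e-3):
    """Central finite-difference Hessian of log_density at the point x."""
    d = x.size
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(i, d):
            ei = np.zeros(d)
            ej = np.zeros(d)
            ei[i] = eps
            ej[j] = eps
            H[i, j] = (log_density(x + ei + ej) - log_density(x + ei - ej)
                       - log_density(x - ei + ej) + log_density(x - ei - ej)) / (4 * eps**2)
            H[j, i] = H[i, j]
    return H

def hessian_score(log_density, samples):
    """Monte Carlo average of the squared Hessian entries over the samples.

    Assumed form of the score: E[(d_i d_j log pi(X))^2].
    """
    return np.mean([hessian_fd(log_density, x) ** 2 for x in samples], axis=0)

# Toy target (an assumption): a 4-variable Gaussian chain whose precision
# matrix is tridiagonal, so only neighboring variables are conditionally dependent.
precision = 2.0 * np.eye(4) - 0.8 * (np.eye(4, k=1) + np.eye(4, k=-1))

def log_density(x):
    # Unnormalized Gaussian log-density; the constant does not affect the Hessian.
    return -0.5 * x @ precision @ x

rng = np.random.default_rng(0)
samples = rng.multivariate_normal(np.zeros(4), np.linalg.inv(precision), size=200)

score = hessian_score(log_density, samples)
graph = score > 1e-3  # hypothetical fixed threshold; edges where the score is non-negligible
print(graph.astype(int))
```

In this Gaussian case the Hessian of the log-density is constant and equal to the negative precision matrix, so the estimated score reduces to the squared precision entries and the thresholded matrix recovers the chain graph. For a non-Gaussian density the Hessian varies with x, which is why the score averages over samples rather than evaluating at a single point.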