Log-linear models are a family of probability distributions that capture relationships between variables, including context-specific independencies. Many approaches exist for automatically learning their independence structures from data, but the only known methods for evaluating these approaches are indirect: they measure the quality of the complete density. This requires additionally learning the numerical parameters, and it introduces distortions when the goal is to compare structures. This work addresses the issue by presenting a measure for the direct and efficient comparison of independence structures of log-linear models. We prove that the measure is a metric, and we give a method for computing it that is efficient in the number of variables of the domain. Efficiency in the number of features of the models is not guaranteed and is left for future work.