A Complete Characterization of Projectivity for Statistical Relational Models

A generative probabilistic model for relational data consists of a family of probability distributions for relational structures over domains of different sizes. In most existing statistical relational learning (SRL) frameworks, these models are not projective, where projectivity means that the marginal of the distribution for size-$n$ structures on induced sub-structures of size $k<n$ equals the given distribution for size-$k$ structures. Projectivity is highly beneficial: it directly enables lifted inference and statistically consistent learning from sub-sampled relational structures. Earlier work identified some simple fragments of SRL languages that represent projective models. However, no complete characterization of, and representation framework for, projective models has been given. In this paper we fill this gap: exploiting representation theorems for infinite exchangeable arrays, we introduce a class of directed graphical latent variable models that corresponds precisely to the class of projective relational models. As a by-product we also obtain a characterization of when a given distribution over size-$k$ structures is the statistical frequency distribution of size-$k$ sub-structures in much larger size-$n$ structures. These results shed new light on the old open problem of how to apply Halpern et al.'s "random worlds approach" for probabilistic inference to general relational signatures.
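The projectivity condition can be checked by brute force on very small domains. The following is a minimal illustrative sketch, not the construction used in the paper: it assumes a single undirected edge relation, and the two model families (`independent_edges` and `triangle_weighted`), their parameter values, and all helper names are hypothetical choices made for this example. Both families are exchangeable, so marginalizing onto the first $k$ nodes is representative of any induced sub-structure of size $k$.

```python
from itertools import combinations, product
from math import exp


def all_graphs(n):
    """Yield every undirected graph on nodes 0..n-1 as a frozenset of edges (a, b), a < b."""
    pairs = list(combinations(range(n), 2))
    for bits in product([0, 1], repeat=len(pairs)):
        yield frozenset(e for e, b in zip(pairs, bits) if b)


def independent_edges(n, p=0.3):
    """Assumed example family: each edge is present independently with probability p."""
    m = n * (n - 1) // 2
    return {g: p ** len(g) * (1 - p) ** (m - len(g)) for g in all_graphs(n)}


def triangle_weighted(n, w=1.5):
    """Assumed example family: probability proportional to exp(w * number of triangles)."""
    def triangles(g):
        return sum(1 for a, b, c in combinations(range(n), 3)
                   if {(a, b), (a, c), (b, c)} <= g)
    weights = {g: exp(w * triangles(g)) for g in all_graphs(n)}
    z = sum(weights.values())
    return {g: wt / z for g, wt in weights.items()}


def marginal(dist, k):
    """Marginalize a distribution over graphs onto the sub-graph induced by nodes 0..k-1."""
    out = {}
    for g, pr in dist.items():
        sub = frozenset(e for e in g if e[1] < k)  # a < b, so b < k keeps both endpoints
        out[sub] = out.get(sub, 0.0) + pr
    return out


def is_projective_pair(family, n, k, tol=1e-12):
    """Check whether the size-n marginal on k nodes matches the family's size-k distribution."""
    small = family(k)
    marg = marginal(family(n), k)
    return all(abs(small[g] - marg.get(g, 0.0)) <= tol for g in small)
```

Under these assumptions, `is_projective_pair(independent_edges, 4, 2)` holds, because marginalizing independent edges leaves the remaining edges unchanged, while `is_projective_pair(triangle_weighted, 3, 2)` fails, because the triangle term makes an edge more likely in a size-3 domain than in a size-2 domain. The check only compares one pair of sizes, so passing it is necessary for projectivity but does not establish it for the whole family.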
