Relational autocompletion is the problem of automatically filling in missing values in multi-relational data. We tackle this problem within the probabilistic logic programming framework of Distributional Clauses (DC), which supports both discrete and continuous probability distributions. Within this framework, we introduce DiceML, an approach that learns both the structure and the parameters of DC programs from relational data, including data with missing values. To realize this, DiceML integrates statistical modeling and distributional clauses with rule learning. The distinguishing features of DiceML are that it 1) tackles autocompletion in relational data, 2) learns distributional clauses extended with statistical models, 3) deals with both discrete and continuous distributions, 4) can exploit background knowledge, and 5) uses an expectation-maximization based algorithm to cope with missing data. The empirical results show the promise of the approach, even when data is missing.