Joint probability mass function (PMF) estimation is a fundamental machine learning problem. Because the number of free parameters grows exponentially with the number of random variables, most work on nonparametric PMF estimation relies on structural assumptions, such as the clique factorization adopted by probabilistic graphical models, or a low-rank constraint on the joint probability tensor and reconstruction from 3-way or 2-way marginals. In the present work, we link random projections of data to the problem of PMF estimation using ideas from tomography. We combine this with low-rank tensor decomposition to show that the joint PMF can be estimated from just one-way marginals in a transformed space. We provide a novel algorithm for recovering the factors of the tensor from one-way marginals, test it on a variety of synthetic and real-world datasets, and perform MAP inference on the estimated model for classification.
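To make the parameter-count argument concrete, the following is a minimal sketch of the low-rank (canonical polyadic) model of a joint PMF tensor. It is not the recovery algorithm proposed in this work; it only builds a random rank-F model, compares its number of free parameters with that of the full joint tensor, and checks that a one-way marginal is a mixture of the corresponding factor's columns. All function names (`random_cpd_pmf`, `joint_tensor`, `full_param_count`, `cpd_param_count`) and the sizes used are hypothetical choices made for illustration.

```python
import numpy as np


def random_cpd_pmf(n_vars, n_states, rank, seed=0):
    """Draw a random rank-`rank` PMF model (illustrative assumption).

    Returns mixing weights `lam` of shape (rank,) and one factor matrix
    per variable of shape (n_states, rank) whose columns sum to 1.
    """
    rng = np.random.default_rng(seed)
    lam = rng.dirichlet(np.ones(rank))
    factors = [
        rng.dirichlet(np.ones(n_states), size=rank).T for _ in range(n_vars)
    ]
    return lam, factors


def joint_tensor(lam, factors):
    """Assemble the full joint PMF as a sum of weighted rank-1 tensors."""
    joint = np.zeros([A.shape[0] for A in factors])
    for f in range(len(lam)):
        term = lam[f]
        for A in factors:
            # Outer product appends one axis per variable.
            term = np.multiply.outer(term, A[:, f])
        joint += term
    return joint


def full_param_count(n_vars, n_states):
    """Free parameters of an unconstrained joint PMF."""
    return n_states ** n_vars - 1


def cpd_param_count(n_vars, n_states, rank):
    """Free parameters of the rank-`rank` model (weights plus factors)."""
    return (rank - 1) + n_vars * rank * (n_states - 1)


lam, factors = random_cpd_pmf(n_vars=4, n_states=3, rank=2)
joint = joint_tensor(lam, factors)

print("sums to one:", np.isclose(joint.sum(), 1.0))
# The one-way marginal of variable 0 equals factors[0] @ lam.
print("marginal matches:", np.allclose(joint.sum(axis=(1, 2, 3)), factors[0] @ lam))
print("full parameters:", full_param_count(4, 3))          # 80
print("low-rank parameters:", cpd_param_count(4, 3, 2))    # 17
```

In this toy setting the unconstrained tensor has 80 free parameters while the rank-2 model has 17, and the gap widens exponentially as variables are added.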