We propose a nonparametric factorization approach for sparsely observed tensors. Here, sparsity does not mean that zero-valued entries are massive or dominant; rather, it means that the observed entries are very few, and become even fewer as the tensor grows, which is ubiquitous in practice. Compared with existing works, our model not only leverages the structural information underlying the observed entry indices, but also provides extra interpretability and flexibility: it can simultaneously estimate a set of location factors that capture the intrinsic properties of the tensor nodes and another set of sociability factors that reflect their extroverted activity in interacting with others, and users are free to choose a trade-off between the two types of factors. Specifically, we use hierarchical Gamma processes and Poisson random measures to construct a tensor-valued process, which can freely sample the two types of factors to generate tensors and always guarantees asymptotic sparsity. We then normalize the tensor process to obtain hierarchical Dirichlet processes from which each observed entry index is sampled, and use a Gaussian process to sample each entry value as a nonlinear function of the factors, so as to capture both the sparse structural properties and the complex node relationships. For efficient inference, we exploit Dirichlet process properties over finite sample partitions, density transformations, and random features to develop a stochastic variational estimation algorithm. We demonstrate the advantage of our method on several benchmark datasets.
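To make the generative construction concrete, the following is a minimal illustrative sketch, not the authors' implementation: it uses a fixed finite truncation in place of the hierarchical Gamma/Poisson and Dirichlet process machinery, assumed names (`location`, `sociability`, `K_modes`, `R`, `M`), and a random Fourier feature approximation of the Gaussian process that maps the concatenated node factors of each sampled entry index to its value.

```python
# Illustrative sketch only: truncated finite approximation of the generative process
# described above, with assumed variable names and dimensions.
import numpy as np

rng = np.random.default_rng(0)
K_modes, n_nodes, R, M, N_obs = 3, 50, 5, 100, 200   # modes, nodes per mode, factor rank,
                                                      # random features, observed entries

# Location factors (intrinsic node properties) and sociability factors (interaction activity).
location = [rng.gamma(1.0, 1.0, size=(n_nodes, R)) for _ in range(K_modes)]
sociability = [rng.gamma(1.0, 1.0, size=n_nodes) for _ in range(K_modes)]

# Normalize sociabilities per mode: a finite stand-in for normalizing the Gamma/Poisson
# construction into Dirichlet-process-like weights over nodes.
weights = [s / s.sum() for s in sociability]

# Sample observed entry indices; nodes with higher sociability appear in more entries,
# which produces the sparse, uneven observation pattern.
indices = np.stack(
    [rng.choice(n_nodes, size=N_obs, p=w) for w in weights], axis=1
)  # shape (N_obs, K_modes)

# Random Fourier features approximating an RBF-kernel GP over concatenated node factors.
D_in = K_modes * R
Omega = rng.normal(size=(D_in, M))
b = rng.uniform(0.0, 2 * np.pi, size=M)
theta = rng.normal(size=M) / np.sqrt(M)               # GP weights in feature space

X = np.concatenate(
    [location[k][indices[:, k]] for k in range(K_modes)], axis=1
)  # (N_obs, K_modes * R)
Phi = np.sqrt(2.0 / M) * np.cos(X @ Omega + b)
values = Phi @ theta + 0.1 * rng.normal(size=N_obs)   # noisy entry values

print(indices[:5], values[:5])
```

In the actual model the truncation is avoided by working directly with the hierarchical Gamma/Dirichlet processes, and the random-feature representation is what enables the stochastic variational inference mentioned above.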