Bayesian nonparametric hierarchical priors are highly effective in providing flexible models for latent data structures that exhibit sharing of information between and across groups. Most prominent are the Hierarchical Dirichlet Process (HDP) and its subsequent variants, which model latent clustering between and across groups. The HDP may be viewed as a more flexible extension of Latent Dirichlet Allocation (LDA) models and has been applied to, for example, topic modelling, natural language processing, and datasets arising in health care. We focus on analogous latent feature allocation models, where the data structures correspond to multisets or unbounded sparse matrices. The fundamental development in this regard is the Hierarchical Indian Buffet Process (HIBP), which utilizes a hierarchy of Beta processes over J groups, where each group generates binary random matrices, reflecting within-group sharing of features, according to beta-Bernoulli IBP priors. To encompass HIBP versions of non-Bernoulli extensions of the IBP, we introduce hierarchical versions of general spike-and-slab IBPs. We provide explicit novel descriptions of the marginal, posterior, and predictive distributions of the HIBP and its generalizations, which allow for exact sampling and simpler practical implementation. We highlight common structural properties of these processes and establish relationships to existing IBP-type and related models arising in the literature. Potential applications include topic models, Poisson factorization models, random count matrix priors, and neural network models.
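For orientation only, the within-group building block mentioned above, a binary feature matrix drawn from a beta-Bernoulli IBP prior, can be sketched with the standard one-parameter "Indian buffet" scheme: row i reuses an existing feature k with probability m_k / i (m_k being the number of earlier rows with that feature) and then adds a Poisson(alpha / i) number of new features. This is the textbook single-group IBP, not the HIBP sampling scheme described in this work. The names `sample_ibp` and `sample_poisson`, the mass parameter `alpha`, and the seed are illustrative assumptions.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw from Poisson(rate) via Knuth's multiplication method (adequate for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_rows, alpha, seed=0):
    """Sample a binary feature matrix from the one-parameter IBP (illustrative sketch).

    Returns a list of num_rows rows, each a list of 0/1 entries of equal length.
    """
    rng = random.Random(seed)
    feature_counts = []  # feature_counts[k] = number of earlier rows that have feature k
    rows = []
    for i in range(1, num_rows + 1):
        # Reuse existing feature k with probability m_k / i.
        row = [1 if rng.random() < m / i else 0 for m in feature_counts]
        # Row i also introduces Poisson(alpha / i) brand-new features.
        num_new = sample_poisson(alpha / i, rng)
        row.extend([1] * num_new)
        # zip stops at the old features; the new ones start with a count of 1.
        feature_counts = [m + z for m, z in zip(feature_counts, row)] + [1] * num_new
        rows.append(row)
    # Pad earlier rows with zeros so every row covers all features seen so far.
    total = len(feature_counts)
    return [row + [0] * (total - len(row)) for row in rows]


if __name__ == "__main__":
    for row in sample_ibp(num_rows=5, alpha=2.0, seed=1):
        print(row)
```

A hierarchical variant would, loosely speaking, let J such groups draw their features from a shared base measure so that features can recur across groups; that coupling is not reproduced in this sketch.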