Bayesian hierarchical mixture clustering (BHMC) improves on traditional Bayesian hierarchical clustering by replacing the conventional Gaussian-to-Gaussian (G2G) kernels in the parent-to-child diffusion of the generative process with a Hierarchical Dirichlet Process Mixture Model (HDPMM). However, BHMC has a drawback: it may produce trees whose nodes at the higher levels (i.e., those closer to the root) exhibit comparatively high variance. This can be interpreted as weak separation between nodes, particularly at the higher levels. We attempt to overcome this drawback through a recent inferential framework named posterior regularization (PR), which provides a simple way to impose extra constraints on a Bayesian model to address its weaknesses. To enhance the separation of clusters, we apply PR to impose max-margin constraints on the nodes at every level of the hierarchy. In this paper, we present the modeling details of applying PR to BHMC and show that the resulting solution achieves the desired improvements over the BHMC model.
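As background, the generic posterior regularization framework (not the paper's specific BHMC formulation) can be sketched as follows: instead of using the exact model posterior $p(\theta \mid \mathcal{D})$, one seeks the closest distribution within a constraint set $\mathcal{Q}$ encoding the desired property, e.g.\ margin separation:

\[
q^{*} \;=\; \arg\min_{q \in \mathcal{Q}} \; \mathrm{KL}\!\left( q(\theta) \,\big\|\, p(\theta \mid \mathcal{D}) \right),
\qquad
\mathcal{Q} \;=\; \left\{\, q \;:\; \mathbb{E}_{q}\!\left[ \phi(\theta, \mathcal{D}) \right] \le \mathbf{b} \,\right\},
\]

where $\phi$ is a constraint feature function and $\mathbf{b}$ a bound vector; in a max-margin variant, $\phi$ would encode violations of a required separation margin between sibling nodes. The symbols $\phi$ and $\mathbf{b}$ here are illustrative placeholders, not notation taken from the paper.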