pivmet: Pivotal Methods for Bayesian Relabelling and k-Means Clustering

The identification of group prototypes, i.e. elements of a dataset that represent different groups of data points, may be relevant to the tasks of clustering, classification and mixture modeling. The R package pivmet presented in this paper includes different methods for extracting pivotal units from a dataset. One of the main applications of pivotal methods is a Markov Chain Monte Carlo (MCMC) relabelling procedure that addresses the label switching problem in Bayesian estimation of mixture models. Each method returns posterior estimates, and the package supplies a set of graphical tools for visualizing the output. The package offers JAGS and Stan sampling procedures for Gaussian mixtures, and allows user-defined prior parameters. The package also provides functions to perform consensus clustering based on pivotal units, which may improve classical techniques (e.g. k-means) through careful seeding. The paper provides examples of applications to both real and simulated datasets.
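To make the relabelling idea concrete, the following is a minimal Python sketch of how pivotal units could be used to undo label switching. It is not pivmet's implementation (the package is written in R), and the function name `relabel_by_pivots`, the toy draws, and the choice of pivots are all hypothetical. The sketch assumes that one pivot per group has already been selected, and that MCMC draws in which two pivots fall into the same component are simply discarded.

```python
def relabel_by_pivots(allocations, means, pivots):
    """Relabel MCMC draws so that the component holding pivot j gets label j.

    allocations: list of draws; each draw is a list of labels in 0..k-1, one per unit
    means:       list of draws; each draw is a list of k component means
    pivots:      list of k unit indices, one pivot per group (assumed given)
    """
    k = len(pivots)
    new_alloc, new_means, kept = [], [], []
    for h, (z, mu) in enumerate(zip(allocations, means)):
        # Component currently occupied by each pivot in this draw.
        pivot_labels = [z[p] for p in pivots]
        # Assumption: a draw where two pivots share a component cannot be
        # relabelled unambiguously, so it is dropped.
        if len(set(pivot_labels)) < k:
            continue
        # Map each old label to the index of the pivot that carries it.
        perm = {old: new for new, old in enumerate(pivot_labels)}
        new_alloc.append([perm[label] for label in z])
        # Reorder the component means with the same permutation.
        new_means.append([mu[old] for old in pivot_labels])
        kept.append(h)
    return new_alloc, new_means, kept


# Toy example: 4 units, 2 components, 3 draws (all values are made up).
allocations = [
    [0, 0, 1, 1],  # labels in the reference order
    [1, 1, 0, 0],  # same partition, labels switched
    [1, 1, 1, 1],  # both pivots in one component
]
means = [
    [-2.0, 3.0],
    [3.1, -1.9],
    [0.5, 3.2],
]
pivots = [0, 3]  # unit 0 represents group 0, unit 3 represents group 1

new_alloc, new_means, kept = relabel_by_pivots(allocations, means, pivots)
print(kept)
print(new_alloc)
print(new_means)
```

In this toy run the second draw is permuted back into the reference labelling and the third draw is dropped, so posterior summaries of the component means can be computed from the retained draws without mixing components.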
