growclusters for R is a package that estimates a partition structure for multivariate data. It does this by implementing a hierarchical version of k-means clustering that accounts for possible known dependencies in a collection of datasets, where each set draws its cluster means from a single, global partition. Each component data set in the collection corresponds to a known group in the data. This paper focuses on R Shiny applications that implement the clustering methodology and simulate data sets with known group structures. These Shiny applications implement novel ways of visualizing the results of the clustering. These visualizations include scatterplots of individual data sets in the context of the entire collection and cluster distributions versus component (or sub-domain) datasets. Data obtained from a collection of 2000-2013 articles from the Bureau of Labor Statistics (BLS) Monthly Labor Review (MLR) will be used to illustrate the R-Shiny applications. Here, the known grouping in the collection is the year of publication.
翻译:暂无翻译