State-of-the-art approaches for clustering high-dimensional data utilize deep auto-encoder architectures. Many of these networks require a large number of parameters and suffer from a lack of interpretability, due to the black-box nature of the auto-encoders. We introduce Mixture Model Auto-Encoders (MixMate), a novel architecture that clusters data by performing inference on a generative model. Derived from the perspective of sparse dictionary learning and mixture models, MixMate comprises several auto-encoders, each tasked with reconstructing data in a distinct cluster, while enforcing sparsity in the latent space. Through experiments on various image datasets, we show that MixMate achieves competitive performance compared to state-of-the-art deep clustering algorithms, while using orders of magnitude fewer parameters.
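To make the idea of "several auto-encoders, each tied to one cluster, with a sparse latent code" concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the encoder, the loss, or the assignment rule. The sketch assumes each auto-encoder is a linear dictionary `D` whose encoder runs a few ISTA iterations (a standard sparse coding solver), and that a sample is assigned to the cluster whose auto-encoder yields the lowest sparse-coding cost. The names `sparse_encode`, `assign_cluster`, `lam`, and `n_iter` are hypothetical.

```python
import numpy as np


def soft_threshold(v, lam):
    """Element-wise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def sparse_encode(x, D, lam=0.1, n_iter=50):
    """Assumed encoder: ISTA for min_z 0.5 * ||x - D z||^2 + lam * ||z||_1."""
    # Step size 1/L, where L is the squared spectral norm of D.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = soft_threshold(z + step * D.T @ (x - D @ z), step * lam)
    return z


def assign_cluster(x, dictionaries, lam=0.1):
    """Assumed rule: pick the auto-encoder with the lowest sparse-coding cost."""
    costs = []
    for D in dictionaries:
        z = sparse_encode(x, D, lam)   # encode: sparse latent code
        x_hat = D @ z                  # decode: linear reconstruction
        cost = 0.5 * np.sum((x - x_hat) ** 2) + lam * np.sum(np.abs(z))
        costs.append(cost)
    return int(np.argmin(costs)), costs


# Toy usage: two dictionaries spanning disjoint coordinate subspaces of R^4.
D0 = np.eye(4)[:, :2]
D1 = np.eye(4)[:, 2:]
label, costs = assign_cluster(np.array([1.0, 0.5, 0.0, 0.0]), [D0, D1])
print(label, costs)
```

In this toy setup the parameters of each auto-encoder are just the entries of its dictionary, which gives a rough intuition for how a dictionary-based design can stay small. In a trained model the dictionaries would be learned from data rather than fixed by hand.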