We propose Deep Variational Clustering (DVC), a framework for unsupervised representation learning and clustering of large-scale medical images. DVC jointly learns a multivariate Gaussian posterior with a probabilistic convolutional encoder and a likelihood distribution with a probabilistic convolutional decoder, while optimizing cluster label assignments. The learned multivariate Gaussian posterior captures the latent distribution of a large set of unlabeled images, and unsupervised clustering is performed on top of this variational latent space using a clustering loss. The probabilistic decoder helps prevent distortion of data points in the latent space and preserves the local structure of the data-generating distribution. Training can be viewed as a self-training process that iteratively refines the latent space while optimizing cluster assignments. We evaluated the framework on three public datasets representing different medical imaging modalities. The experimental results show that it generalizes better across different datasets and achieves compelling results on several medical imaging benchmarks. Our approach thus offers potential advantages over conventional deep unsupervised learning in real-world applications. The source code of the method and all experiments is publicly available at: https://github.com/csfarzin/DVC
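The joint objective described above can be illustrated with a minimal sketch. The abstract does not specify the form of the clustering loss, the prior, or the loss weighting, so everything concrete here is an assumption made for illustration: a standard normal prior, a DEC-style Student's t soft assignment with a sharpened target distribution, soft assignments computed from the posterior means, and the hypothetical names `gaussian_kl`, `soft_assign`, `target_distribution`, `clustering_loss`, `dvc_objective`, `alpha`, and `gamma`. The reconstruction term is passed in as a number because the decoder itself is out of scope. The released code at the URL above may differ.

```python
import math


def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for one latent vector.

    Assumes a standard normal prior, which the abstract does not state.
    """
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


def soft_assign(z, centroids, alpha=1.0):
    """Soft assignment of one latent point to each centroid.

    Uses a Student's t kernel (an assumed, DEC-style choice).
    """
    weights = []
    for c in centroids:
        d2 = sum((zi - ci) ** 2 for zi, ci in zip(z, c))
        weights.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
    total = sum(weights)
    return [w / total for w in weights]


def target_distribution(q):
    """Sharpened targets: p_ij proportional to q_ij^2 / f_j, f_j = soft cluster size."""
    n_clusters = len(q[0])
    f = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        w = [row[j] ** 2 / f[j] for j in range(n_clusters)]
        s = sum(w)
        p.append([x / s for x in w])
    return p


def clustering_loss(p, q):
    """KL(P || Q), averaged over the batch."""
    total = 0.0
    for p_row, q_row in zip(p, q):
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p_row, q_row) if pi > 0)
    return total / len(q)


def dvc_objective(recon_loss, mu_batch, logvar_batch, centroids, gamma=0.1):
    """Reconstruction + KL regularizer + gamma * clustering loss (assumed weighting)."""
    kl = sum(gaussian_kl(m, lv) for m, lv in zip(mu_batch, logvar_batch)) / len(mu_batch)
    # Assumption: cluster on posterior means rather than sampled latents.
    q = [soft_assign(m, centroids) for m in mu_batch]
    p = target_distribution(q)
    return recon_loss + kl + gamma * clustering_loss(p, q)


# Toy usage: two latent points, two centroids, a made-up reconstruction loss.
mu_batch = [[0.1, -0.2], [2.0, 1.9]]
logvar_batch = [[0.0, 0.0], [-0.5, -0.5]]
centroids = [[0.0, 0.0], [2.0, 2.0]]
loss = dvc_objective(recon_loss=1.0, mu_batch=mu_batch,
                     logvar_batch=logvar_batch, centroids=centroids)
```

In a self-training loop of this kind, the targets `p` would typically be recomputed periodically from the current soft assignments, so that the latent space and the cluster assignments refine each other over iterations.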