Clustering is a fundamental task in computer vision and pattern recognition. Recently, deep clustering methods (clustering algorithms based on deep learning) have attracted wide attention for their impressive performance. Most of these algorithms combine deep unsupervised representation learning with a standard clustering method. However, separating representation learning from clustering leads to suboptimal solutions, because the two-stage strategy prevents representation learning from adapting to the subsequent task (e.g., clustering according to specific cues). To overcome this issue, efforts have been made to adapt the representation and the cluster assignment dynamically, but current state-of-the-art methods still rely on heuristically constructed objectives in which representation and cluster assignment are optimized alternately. To further standardize the clustering problem, we formulate the objective of clustering as finding a precise feature that serves as the cue for cluster assignment. Based on this formulation, we propose a general-purpose deep clustering framework that, for the first time, integrates representation learning and clustering into a single pipeline. The proposed framework exploits the ability of recently developed generative models to learn intrinsic features, and imposes entropy minimization on the distribution of the cluster assignment through a dedicated variational algorithm. Experimental results show that the proposed method is superior to, or at least comparable with, state-of-the-art methods on benchmark datasets for handwritten digit recognition, fashion recognition, face recognition, and object recognition.
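The entropy-minimization idea mentioned above can be illustrated with a minimal sketch. It is not the dedicated variational algorithm of the proposed framework, which the abstract does not specify. It only shows, under the assumption that a model outputs one score per cluster for each sample, how the entropy of a soft cluster assignment can be measured, and why lower entropy corresponds to more confident assignments. All function names and the toy scores are hypothetical.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of one cluster-assignment distribution."""
    return -sum(p * math.log(p + eps) for p in probs)

def mean_assignment_entropy(batch_logits):
    """Average entropy of the soft assignments over a batch of samples."""
    return sum(entropy(softmax(row)) for row in batch_logits) / len(batch_logits)

# Hypothetical per-cluster scores for two samples and three clusters.
confident = [[8.0, 0.0, 0.0], [0.0, 9.0, 0.0]]  # nearly one-hot assignments
uncertain = [[0.1, 0.0, 0.2], [0.0, 0.1, 0.0]]  # nearly uniform assignments

# Minimizing this quantity pushes each sample toward a single cluster.
print(mean_assignment_entropy(confident) < mean_assignment_entropy(uncertain))
```

In a trainable model, a term of this kind would typically be added to the loss so that gradient descent sharpens the assignments. How it is combined with the generative objective is left open here, because the abstract gives no details.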