Constrained clustering has been extensively explored by researchers and used by practitioners. Constrained formulations exist for popular algorithms such as k-means, mixture models, and spectral clustering, but they have several limitations. A fundamental strength of deep learning is its flexibility, and here we explore a deep learning framework for constrained clustering, in particular how it can extend the field. We show that our framework can handle not only standard together/apart constraints generated from labeled side information (without the well-documented negative effects reported earlier) but also more complex constraints generated from new types of side information, such as continuous values and high-level domain knowledge. Furthermore, we propose an efficient training paradigm that is generally applicable to these four types of constraints. We validate the effectiveness of our approach with empirical results on both image and text datasets. We also study the robustness of our framework when learning with noisy constraints and show how different components of our framework contribute to the final performance. Our source code is available at https://github.com/blueocean92/deep_constrained_clustering.