Image segmentation is a fundamental task in computer vision. Data annotation for training supervised methods can be labor-intensive, motivating unsupervised methods. Some existing approaches extract deep features from pre-trained networks and build a graph in order to apply classical clustering methods (e.g., $k$-means and normalized cuts) as a post-processing stage. These techniques reduce the high-dimensional information encoded in the features to pairwise scalar affinities. In this work, we replace classical clustering algorithms with a lightweight Graph Neural Network (GNN) trained to achieve the same clustering objective function. However, in contrast to existing approaches, we feed the GNN not only the pairwise affinities between local image features but also the raw features themselves. Maintaining this connection between the raw features and the clustering goal allows us to perform semantic part segmentation implicitly, without requiring additional post-processing steps. We demonstrate how classical clustering objectives can be formulated as self-supervised loss functions for training our image segmentation GNN. Additionally, we use the Correlation-Clustering (CC) objective to perform clustering without predefining the number of clusters ($k$-less clustering). We apply the proposed method to object localization, segmentation, and semantic part segmentation tasks, surpassing state-of-the-art performance on multiple benchmarks.
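For concreteness, the following is a sketch of how one classical objective, the normalized cut, can be relaxed into a differentiable self-supervised loss over soft GNN cluster assignments; the notation ($W$, $S$, $d$) is illustrative, and the exact formulation used in the paper may differ. Given soft assignments $S \in [0,1]^{n \times k}$ for $n$ image patches (e.g., a row-wise softmax over the GNN outputs), a pairwise affinity matrix $W \in \mathbb{R}^{n \times n}$, and node degrees $d = W\mathbf{1}$, the standard soft relaxation of the normalized-cut objective gives
\[
\mathcal{L}_{\text{Ncut}} = k - \sum_{c=1}^{k} \frac{S_{:,c}^{\top} W\, S_{:,c}}{S_{:,c}^{\top} d},
\]
which is minimized when each cluster has high internal affinity relative to its total volume. Analogously, with signed affinities $W_{ij}$, a soft correlation-clustering loss can be written as $\mathcal{L}_{\text{CC}} = -\sum_{i,j} W_{ij}\,(S S^{\top})_{ij}$, where $(S S^{\top})_{ij}$ approximates the probability that patches $i$ and $j$ share a cluster; this form does not require fixing $k$ in advance.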