Graph Neural Networks (GNNs) with attention have been successfully applied to learning visual feature matching. However, current methods learn on complete graphs, resulting in quadratic complexity in the number of features. Motivated by the prior observation that self- and cross-attention matrices converge to sparse representations, we propose ClusterGNN, an attentional GNN architecture that operates on clusters to learn the feature matching task. Using a progressive clustering module, we adaptively divide keypoints into different subgraphs to reduce redundant connectivity, and employ a coarse-to-fine paradigm to mitigate misclassification within images. Our approach yields a 59.7% reduction in runtime and a 58.4% reduction in memory consumption for dense detection compared to current state-of-the-art GNN-based matching, while achieving competitive performance on various computer vision tasks.
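To make the complexity argument concrete, the sketch below contrasts full self-attention over a complete graph of N keypoints (an O(N^2) score matrix) with attention restricted to clusters, whose cost is the sum of the squared cluster sizes. This is a minimal illustration, not the authors' implementation: the cluster labels here are random placeholders, whereas ClusterGNN learns them with its progressive clustering module.

```python
import torch
import torch.nn.functional as F

def full_attention(x):
    """Self-attention over all N keypoints: builds an (N, N) score matrix."""
    scores = x @ x.transpose(-2, -1) / x.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ x

def cluster_attention(x, labels):
    """Attention restricted to each cluster: cost is sum_k O(n_k^2)."""
    out = torch.empty_like(x)
    for k in labels.unique():
        idx = (labels == k).nonzero(as_tuple=True)[0]
        xk = x[idx]                                            # (n_k, d)
        scores = xk @ xk.transpose(-2, -1) / x.shape[-1] ** 0.5
        out[idx] = F.softmax(scores, dim=-1) @ xk
    return out

N, d, K = 2048, 256, 16
x = torch.randn(N, d)                     # keypoint descriptors
labels = torch.randint(0, K, (N,))        # stand-in for learned cluster assignments
y_full = full_attention(x)                # complete-graph baseline
y_clus = cluster_attention(x, labels)     # cluster-restricted variant
```

With K roughly balanced clusters, the attention cost drops from N^2 to about N^2/K score entries, which is the source of the runtime and memory savings the abstract reports.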