Intra-class variation in open-world settings poses a range of challenges for classification tasks. Fine-grained classification was introduced to address these challenges, and many approaches have been proposed. Some rely on locating and using distinguishable local parts within images to achieve invariance to viewpoint changes, intra-class differences, and local part deformations. Our approach, inspired by P2P-Net, offers an end-to-end trainable, attention-based parts alignment module in which the graph-matching component of P2P-Net is replaced with a self-attention mechanism. The attention module can learn the optimal arrangement of parts, with the parts attending to one another, before they contribute to the global loss.
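To make the idea of parts attending to one another concrete, the following is a minimal sketch of single-head scaled dot-product self-attention over a set of part feature vectors. It is an illustration under stated assumptions, not the module described here or the P2P-Net implementation: the function name `part_self_attention`, the number of parts, the feature dimension, and the random projection matrices are all hypothetical, and a trained model would learn the projections from the global loss.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def part_self_attention(parts, w_q, w_k, w_v):
    """Single-head self-attention over part features (hypothetical helper).

    parts: array of shape (num_parts, dim), one feature vector per local part.
    w_q, w_k, w_v: projection matrices of shape (dim, dim); random here,
    learned in a real model.
    Returns the attended part features and the attention weights.
    """
    q = parts @ w_q
    k = parts @ w_k
    v = parts @ w_v
    # Each row holds one part's similarity to every part, scaled by sqrt(dim).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    # Each output row is a weighted mixture of all parts' value vectors.
    return weights @ v, weights


# Assumed toy sizes: 4 parts with 8-dimensional features.
rng = np.random.default_rng(0)
num_parts, dim = 4, 8
parts = rng.normal(size=(num_parts, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3))

attended, weights = part_self_attention(parts, w_q, w_k, w_v)
```

Without positional encodings, this operation is permutation-equivariant: reordering the input parts reorders the outputs in the same way, so the result does not depend on the order in which parts were detected.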