Recently, many view-based 3D model retrieval methods have been proposed and have achieved state-of-the-art performance. Most of these methods focus on extracting more discriminative view-level features and effectively aggregating the multi-view images of a 3D model, but the latent relationships among these multi-view images are not fully explored. We therefore tackle this problem from the perspective of exploiting the relationships between patch features to capture long-range associations among multi-view images. To this end, we propose a novel patch convolutional neural network (PCNN) for view-based 3D model retrieval. Specifically, we first employ a CNN to extract patch features from each view image separately. Second, a novel neural network module named PatchConv is designed to exploit the intrinsic relationships between neighboring patches in the feature space, capturing long-range associations among multi-view images. Then, an adaptive weighted view layer is further embedded into PCNN to automatically assign a weight to each view according to the similarity between its feature and the view-pooling feature. Finally, a discrimination loss function, consisting of the softmax losses produced by the fusion classifier and the specific classifier, is employed to learn a discriminative 3D model feature. Extensive experiments on two public 3D model retrieval benchmarks, ModelNet40 and ModelNet10, demonstrate that the proposed PCNN outperforms state-of-the-art approaches, with mAP values of 93.67% and 96.23%, respectively.
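To make the view-aggregation step concrete, the following is a minimal sketch (not the authors' released implementation) of an adaptive weighted view layer as described above: each view receives a weight determined by the similarity between its feature and the view-pooling feature. PyTorch, element-wise max view-pooling, cosine similarity as the similarity measure, and softmax normalization of the weights are all assumptions made for illustration.

import torch
import torch.nn.functional as F


def adaptive_weighted_view_pooling(view_feats: torch.Tensor) -> torch.Tensor:
    """Aggregate multi-view features with similarity-based weights.

    view_feats: (batch, num_views, dim) per-view feature vectors.
    Returns:    (batch, dim) aggregated 3D model descriptor.
    """
    # View-pooling feature: element-wise max over the view axis
    # (the common choice in multi-view CNNs; an assumption here).
    pooled, _ = view_feats.max(dim=1)                        # (batch, dim)

    # Similarity between each view feature and the view-pooling feature.
    sims = F.cosine_similarity(view_feats,
                               pooled.unsqueeze(1), dim=-1)  # (batch, num_views)

    # Normalize similarities into per-view weights.
    weights = F.softmax(sims, dim=1)                         # (batch, num_views)

    # Weighted sum of the view features gives the model-level feature.
    return (weights.unsqueeze(-1) * view_feats).sum(dim=1)   # (batch, dim)


if __name__ == "__main__":
    feats = torch.randn(2, 12, 512)  # 2 models, 12 views, 512-dim features
    print(adaptive_weighted_view_pooling(feats).shape)       # torch.Size([2, 512])

Weighting views by their agreement with the pooled feature lets informative views dominate the final descriptor while near-redundant or occluded views are down-weighted, which matches the motivation stated in the abstract.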