We present our solutions to the Google Landmark Challenges 2021, for both the retrieval and the recognition tracks. Both solutions are ensembles of transformer and ConvNet models trained with Sub-center ArcFace with dynamic margins. Since the two tracks share the same training data, we used the same pipeline and training approach for both, but with different model selections for the ensemble and different post-processing. The key improvement over last year's solution is the use of newer state-of-the-art vision architectures, especially transformers, which significantly outperform ConvNets on the retrieval task. We finished in third and fourth place in the retrieval and recognition tracks, respectively.
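To make the loss named above concrete, here is a minimal pure-Python sketch of Sub-center ArcFace with a per-class dynamic margin. It is an illustration under stated assumptions, not the authors' implementation: the margin form `a * n**(-lam) + b`, the values of `a`, `b`, `lam` and `scale`, and all function names are hypothetical choices made for this example, and clamping the shifted angle at pi is a simplification.

```python
import math

def dynamic_margin(class_count, a=0.5, b=0.05, lam=0.25):
    """Assumed margin schedule: rarer classes get a larger margin."""
    return a * class_count ** (-lam) + b

def _normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def _cosine(u, v):
    return sum(x * y for x, y in zip(_normalize(u), _normalize(v)))

def subcenter_arcface_logits(embedding, centers, target, margins, scale=30.0):
    """centers[c][k] is sub-center k of class c; returns one logit per class."""
    logits = []
    for c, subcenters in enumerate(centers):
        # Sub-center step: keep only the closest sub-center of each class.
        cos = max(_cosine(embedding, w) for w in subcenters)
        cos = max(-1.0, min(1.0, cos))  # guard acos against rounding error
        if c == target:
            # ArcFace step: add the angular margin to the target class only.
            theta = math.acos(cos)
            cos = math.cos(min(math.pi, theta + margins[c]))
        logits.append(scale * cos)
    return logits

def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for a single sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

# Toy example: two classes with two 2-D sub-centers each (made-up numbers).
counts = [1000, 10]  # hypothetical number of training images per class
margins = [dynamic_margin(n) for n in counts]
centers = [[[1.0, 0.0], [0.9, 0.1]], [[0.0, 1.0], [0.1, 0.9]]]
logits = subcenter_arcface_logits([0.2, 1.0], centers, target=1, margins=margins)
loss = cross_entropy(logits, target=1)
```

In this sketch the rare class (10 images) receives a larger margin than the frequent one (1000 images), so its target logit is penalized more during training, which is the intended effect of making the margin depend on class size.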