Not everyone is wealthy enough to have hundreds of GPUs or TPUs, so we have to find another way. In this paper, we introduce the data-efficient instance segmentation method we used in the 2021 VIPriors Instance Segmentation Challenge. Our solution is a modified version of Swin Transformer, built on the mmdetection toolbox. To address the lack of data, we train our model with data augmentation, including random flipping and multiscale training. During inference, multiscale fusion is used to boost performance. We use only a single GPU throughout the training and testing stages. Our team, THU_IVG_2018, achieved an AP@0.50:0.95 of 0.366 on the test set, which is competitive with other top-ranking methods even though only one GPU is used. In addition, our method achieved an AP@0.50:0.95 (medium) of 0.592, which ranks second among all contestants. Overall, our team ranked third among all contestants, as announced by the organizers.
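The three techniques named in the abstract (random flip, multiscale training, and multiscale fusion at inference) can be illustrated with a minimal, dependency-free sketch. This is not the authors' mmdetection configuration: the scale set `TRAIN_SCALES`, the helper names, and the `predict` callback are all hypothetical, and fusion is simplified here to averaging per-pixel score maps, whereas fusing instance predictions in a real detector is more involved.

```python
import random

# Hypothetical scale factors; the abstract does not state which scales were used.
TRAIN_SCALES = (0.5, 1.0, 1.5)


def hflip(grid):
    """Mirror a 2-D grid (list of rows) left to right."""
    return [row[::-1] for row in grid]


def resize_nearest(grid, out_h, out_w):
    """Nearest-neighbour resize of a 2-D grid to out_h x out_w."""
    in_h, in_w = len(grid), len(grid[0])
    return [
        [grid[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def augment(image, mask, rng, flip_prob=0.5, scales=TRAIN_SCALES):
    """Random flip plus a randomly chosen scale, applied to image and mask alike."""
    if rng.random() < flip_prob:
        image, mask = hflip(image), hflip(mask)
    s = rng.choice(scales)
    h = max(1, round(len(image) * s))
    w = max(1, round(len(image[0]) * s))
    return resize_nearest(image, h, w), resize_nearest(mask, h, w)


def multiscale_fusion(image, predict, scales=TRAIN_SCALES):
    """Average score maps predicted at several scales, at the input resolution.

    `predict` is an assumed callback mapping a 2-D image to a 2-D score map
    of the same size; it stands in for the trained model.
    """
    h, w = len(image), len(image[0])
    fused = [[0.0] * w for _ in range(h)]
    for s in scales:
        sh, sw = max(1, round(h * s)), max(1, round(w * s))
        scores = resize_nearest(predict(resize_nearest(image, sh, sw)), h, w)
        for y in range(h):
            for x in range(w):
                fused[y][x] += scores[y][x] / len(scales)
    return fused
```

Applying the same flip and scale to the image and its mask keeps the annotation aligned with the pixels, which is the property any such augmentation has to preserve.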