In this report, we introduce our winning solution for Real-time 3D Detection, which also earned the "Most Efficient Model" title, in the Waymo Open Dataset Challenges at CVPR 2021. Extending AFDet, our award-winning model from last year, we made a handful of modifications to the base model that improve accuracy while greatly reducing latency. The modified model, named AFDetV2, features a lite 3D Feature Extractor, an improved RPN with an extended receptive field, and an added sub-head that produces an IoU-aware confidence score. These model enhancements, together with enriched data augmentation, stochastic weight averaging, and a GPU-based implementation of voxelization, lead to a winning accuracy of 73.12 mAPH/L2 for AFDetV2 at a latency of 60.06 ms, and an accuracy of 72.57 mAPH/L2 for AFDetV2-base, which the challenge sponsor titled the "Most Efficient Model", at a winning latency of 55.86 ms.
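Of the training techniques listed above, stochastic weight averaging is the simplest to illustrate. The sketch below shows only the general idea, an equal-weight running average over several checkpoints, and is not the training code behind AFDetV2. The flat `Weights` dictionary format and the helper names `update_swa` and `average_checkpoints` are assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical checkpoint format: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def update_swa(swa: Weights, new: Weights, n_averaged: int) -> Weights:
    """Fold one more checkpoint into a running equal-weight average.

    `n_averaged` is the number of checkpoints already represented in `swa`.
    """
    return {
        name: [
            (avg * n_averaged + w) / (n_averaged + 1)
            for avg, w in zip(swa[name], new[name])
        ]
        for name in swa
    }


def average_checkpoints(checkpoints: List[Weights]) -> Weights:
    """Return the element-wise mean of all checkpoints."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    # Start from a copy of the first checkpoint, then fold in the rest.
    swa = {name: list(values) for name, values in checkpoints[0].items()}
    for n, ckpt in enumerate(checkpoints[1:], start=1):
        swa = update_swa(swa, ckpt, n)
    return swa


if __name__ == "__main__":
    ckpts = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}, {"w": [5.0, 9.0]}]
    print(average_checkpoints(ckpts))
```

In practice the averaged weights replace the final checkpoint at inference time, so the technique adds no latency. Normalization statistics usually need to be recomputed for the averaged model, which this sketch leaves out.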