We aim to detect and identify multiple objects using multiple cameras and computer vision for disaster response drones. The major challenges are taming detection errors, resolving ID switching and fragmentation, and adapting to multi-scale features and multiple views with global camera motion. We propose two simple approaches to address these issues. The first is a fast multi-camera system that adds tracklet association, and the second incorporates a high-performance detector and tracker to resolve restrictions. (...) On the validation dataset, the accuracy of our first approach (85.71%) is slightly higher than that of our baseline, FairMOT (85.44%). In the final results, calculated based on L2-norm error, the baseline scored 48.1 while the proposed model combination scored 34.9, a 27.4% reduction in error. In the second approach, although DeepSORT processes only a quarter of all frames due to hardware and time limitations, our model with DeepSORT (42.9%) outperforms FairMOT (71.4%) in terms of recall. Our two models ranked second and third in the 'AI Grand Challenge' organized by the Korean Ministry of Science and ICT in 2020 and 2021, respectively. The source code is publicly available at github.com/mlvlab/drone_ai_challenge, github.com/mlvlab/Drone_Task1, github.com/mlvlab/Rony2_task3, and github.com/mlvlab/Drone_task4.
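The reported error reduction can be checked with a short calculation. The sketch below is illustrative only: the abstract does not say which quantities the L2-norm error is computed over, so `l2_norm_error` is a hypothetical helper showing the generic Euclidean definition, and `relative_reduction` is likewise an assumed name. Only the two error values, 48.1 and 34.9, come from the text.

```python
import math


def l2_norm_error(predicted, target):
    """Euclidean (L2) distance between two equal-length sequences.

    Hypothetical helper: the abstract does not specify what is compared,
    so this only illustrates the generic definition of an L2-norm error.
    """
    if len(predicted) != len(target):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)))


def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Error values quoted in the abstract.
baseline_error = 48.1   # FairMOT baseline
proposed_error = 34.9   # proposed model combination

reduction_pct = 100 * relative_reduction(baseline_error, proposed_error)
print(f"{reduction_pct:.1f}%")  # prints "27.4%"
```

The result, (48.1 - 34.9) / 48.1 ≈ 0.274, agrees with the 27.4% figure stated above.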