The Three-River-Source region is one of China's most significant nature reserves and harbors a wealth of wild plant resources. To meet the practical needs of botanical research and intelligent plant management, we construct a large-scale dataset for Plant detection in the Three-River-Source region (PTRS). The dataset comprises 6,965 high-resolution images of 2160 × 3840 pixels, captured by diverse sensors and platforms and featuring objects of varying shapes and sizes. A team of botanical image interpretation experts annotated these images with 21 commonly occurring object categories. The fully annotated PTRS images contain 122,300 instances of plant leaves, each labeled with a horizontal bounding box. PTRS poses challenges such as dense occlusion, varying leaf resolutions, and high feature similarity among plants, which prompted us to develop a novel object detection network named PlantDet. The network employs a window-based efficient self-attention module (ST block) to generate robust feature representations at multiple scales, improving detection efficiency for small and densely occluded objects. Our experimental results validate the proposed plant detection benchmark: PlantDet achieves a precision of 88.1%, a mean average precision (mAP) of 77.6%, and a higher recall than the baseline. Our method also effectively addresses the problem of missed small objects. We intend to share our data and code with interested parties to advance further research in this field.
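To make the reported metrics concrete, the following is a minimal sketch of how precision and recall could be computed for horizontal bounding boxes. It is not the evaluation code used for PTRS. The IoU threshold of 0.5, the greedy confidence-ordered matching, the single-class simplification, and the names `iou` and `precision_recall` are all assumptions chosen for illustration; the abstract does not specify the evaluation protocol.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned (horizontal) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Tuple[Box, float]],  # (box, confidence score)
    truths: List[Box],
    iou_thr: float = 0.5,  # assumed threshold, not taken from the paper
) -> Tuple[float, float]:
    """Greedy one-to-one matching, highest-confidence predictions first."""
    matched = set()  # indices of ground-truth boxes already claimed
    true_pos = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, iou_thr
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            matched.add(best_j)
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(truths) if truths else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy data: two annotated leaves, one correct and one spurious prediction.
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)]
    print(precision_recall(preds, truths))
```

Under this matching rule, a ground-truth leaf that no prediction overlaps sufficiently lowers recall, which is how missed small objects would show up in the metric. mAP additionally averages precision over recall levels and over the 21 categories, which this sketch does not attempt.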